1#############################################################################
2# Pod/Checker.pm -- check pod documents for syntax errors
3#
4# Copyright (C) 1994-2000 by Bradford Appleton. All rights reserved.
5# This is free software; you can redistribute it and/or modify it under the
6# same terms as Perl itself.
7#############################################################################
8
9package Pod::Checker;
10use strict;
11use warnings;
12
13our $VERSION = '1.73';  ## Current version of this package
14
15=head1 NAME
16
17Pod::Checker - check pod documents for syntax errors
18
19=head1 SYNOPSIS
20
21  use Pod::Checker;
22
23  $syntax_okay = podchecker($filepath, $outputpath, %options);
24
25  my $checker = Pod::Checker->new(%options);
26  $checker->parse_from_file($filepath, \*STDERR);
27
28=head1 OPTIONS/ARGUMENTS
29
30C<$filepath> is the input POD to read and C<$outputpath> is
31where to write POD syntax error messages. Either argument may be a scalar
32indicating a file-path, or else a reference to an open filehandle.
33If unspecified, the input-file it defaults to C<\*STDIN>, and
34the output-file defaults to C<\*STDERR>.
35
36=head2 podchecker()
37
38This function can take a hash of options:
39
40=over 4
41
42=item B<-warnings> =E<gt> I<val>
43
44Turn warnings on/off. I<val> is usually 1 for on, but higher values
45trigger additional warnings. See L<"Warnings">.
46
47=item B<-quiet> =E<gt> I<val>
48
49If C<val> is true, do not print any errors/warnings.
50
51=back
52
53=head1 DESCRIPTION
54
55B<podchecker> will perform syntax checking of Perl5 POD format documentation.
56
57Curious/ambitious users are welcome to propose additional features they wish
58to see in B<Pod::Checker> and B<podchecker> and verify that the checks are
59consistent with L<perlpod>.
60
61The following checks are currently performed:
62
63=over 4
64
65=item *
66
67Unknown '=xxxx' commands, unknown 'XE<lt>...E<gt>' interior-sequences,
68and unterminated interior sequences.
69
70=item *
71
72Check for proper balancing of C<=begin> and C<=end>. The contents of such
73a block are generally ignored, i.e. no syntax checks are performed.
74
75=item *
76
77Check for proper nesting and balancing of C<=over>, C<=item> and C<=back>.
78
79=item *
80
81Check for same nested interior-sequences (e.g.
82C<LE<lt>...LE<lt>...E<gt>...E<gt>>).
83
84=item *
85
86Check for malformed or non-existing entities C<EE<lt>...E<gt>>.
87
88=item *
89
90Check for correct syntax of hyperlinks C<LE<lt>...E<gt>>. See L<perlpod>
91for details.
92
93=item *
94
95Check for unresolved document-internal links. This check may also reveal
96misspelled links that seem to be internal links but should be links
97to something else.
98
99=back
100
101=head1 DIAGNOSTICS
102
103=head2 Errors
104
105=over 4
106
107=item * empty =headn
108
109A heading (C<=head1> or C<=head2>) without any text? That ain't no
110heading!
111
112=item * =over on line I<N> without closing =back
113
114=item * You forgot a '=back' before '=headI<N>'
115
116=item * =over is the last thing in the document?!
117
118The C<=over> command does not have a corresponding C<=back> before the
119next heading (C<=head1> or C<=head2>) or the end of the file.
120
121=item * '=item' outside of any '=over'
122
123=item * =back without =over
124
125An C<=item> or C<=back> command has been found outside a
126C<=over>/C<=back> block.
127
128=item * Can't have a 0 in =over I<N>
129
130You need to indent a strictly positive number of spaces, not 0.
131
132=item * =over should be: '=over' or '=over positive_number'
133
134Either have an argumentless =over, or have its argument a strictly positive number.
135
136=item * =begin I<TARGET> without matching =end I<TARGET>
137
138A C<=begin> command was found that has no matching =end command.
139
140=item * =begin without a target?
141
142A C<=begin> command was found that is not followed by the formatter
143specification.
144
145=item * =end I<TARGET> without matching =begin.
146
147A standalone C<=end> command was found.
148
149=item * '=end' without a target?
150
151'=end' directives need to have a target, just like =begin directives.
152
153=item * '=end I<TARGET>' is invalid.
154
155I<TARGET> needs to be one word
156
157=item * =end I<CONTENT> doesn't match =begin I<TARGET>
158
159I<CONTENT> needs to match =begin's I<TARGET>.
160
161=item * =for without a target?
162
163There is no specification of the formatter after the C<=for> command.
164
165=item * unresolved internal link I<NAME>
166
167The given link to I<NAME> does not have a matching node in the current
168POD. This also happened when a single word node name is not enclosed in
169C<"">.
170
171=item * Unknown directive: I<CMD>
172
173An invalid POD command has been found. Valid are C<=head1>, C<=head2>,
174C<=head3>, C<=head4>, C<=over>, C<=item>, C<=back>, C<=begin>, C<=end>,
175C<=for>, C<=pod>, C<=cut>
176
177=item * Deleting unknown formatting code I<SEQ>
178
179An invalid markup command has been encountered. Valid are:
180C<BE<lt>E<gt>>, C<CE<lt>E<gt>>, C<EE<lt>E<gt>>, C<FE<lt>E<gt>>,
181C<IE<lt>E<gt>>, C<LE<lt>E<gt>>, C<SE<lt>E<gt>>, C<XE<lt>E<gt>>,
182C<ZE<lt>E<gt>>
183
184=item * Unterminated I<SEQ>E<lt>E<gt> sequence
185
186An unclosed formatting code
187
188=item * An EE<lt>...E<gt> surrounding strange content
189
190The I<STRING> found cannot be interpreted as a character entity.
191
192=item * An empty EE<lt>E<gt>
193
194=item * An empty C<< LE<lt>E<gt> >>
195
196=item * An empty XE<lt>E<gt>
197
198There needs to be content inside E, L, and X formatting codes.
199
200=item * A non-empty ZE<lt>E<gt>
201
202The C<ZE<lt>E<gt>> sequence is supposed to be empty.
203
204=item * Spurious text after =pod / =cut
205
206The commands C<=pod> and C<=cut> do not take any arguments.
207
208=item * =back doesn't take any parameters, but you said =back I<ARGUMENT>
209
210The C<=back> command does not take any arguments.
211
212=item * =pod directives shouldn't be over one line long!  Ignoring all I<N> lines of content
213
214Self explanatory
215
216=item * =cut found outside a pod block.
217
218A '=cut' directive found in the middle of non-POD
219
220=item * Invalid =encoding syntax: I<CONTENT>
221
222Syntax error in =encoding directive
223
224=back
225
226=head2 Warnings
227
228These may not necessarily cause trouble, but indicate mediocre style.
229
230=over 4
231
232=item * nested commands I<CMD>E<lt>...I<CMD>E<lt>...E<gt>...E<gt>
233
234Two nested identical markup commands have been found. Generally this
235does not make sense.
236
237=item * multiple occurrences (I<N>) of link target I<name>
238
239The POD file has some C<=item> and/or C<=head> commands that have
240the same text. Potential hyperlinks to such a text cannot be unique then.
241This warning is printed only with warning level greater than one.
242
243=item * line containing nothing but whitespace in paragraph
244
245There is some whitespace on a seemingly empty line. POD is very sensitive
246to such things, so this is flagged. B<vi> users switch on the B<list>
247option to avoid this problem.
248
249=item * =item has no contents
250
251There is a list C<=item> that has no text contents. You probably want to delete
252empty items.
253
254=item * You can't have =items (as at line I<N>) unless the first thing after the =over is an =item
255
256A list introduced by C<=over> starts with a text or verbatim paragraph,
257but continues with C<=item>s. Move the non-item paragraph out of the
258C<=over>/C<=back> block.
259
260=item * Expected '=item I<EXPECTED VALUE>'
261
262=item * Expected '=item *'
263
264=item * Possible =item type mismatch: 'I<x>' found leading a supposed definition =item
265
266A list started with e.g. a bullet-like C<=item> and continued with a
267numbered one. This is obviously inconsistent. For most translators the
268type of the I<first> C<=item> determines the type of the list.
269
270=item * You have '=item x' instead of the expected '=item I<N>'
271
272Erroneous numbering of =item numbers; they need to ascend consecutively.
273
274=item * Unknown E content in EE<lt>I<CONTENT>E<gt>
275
276A character entity was found that does not belong to the standard
277ISO set or the POD specials C<verbar> and C<sol>. I<Currently, this warning
278only appears if a character entity was found that does not have a Unicode
279character. This should be fixed to adhere to the original warning.>
280
281=item * empty =over/=back block
282
283The list opened with C<=over> does not contain anything.
284
285=item * empty section in previous paragraph
286
287The previous section (introduced by a C<=head> command) does not contain
288any valid content. This usually indicates that something is missing. Note: A
289C<=head1> followed immediately by C<=head2> does not trigger this warning.
290
291=item * Verbatim paragraph in NAME section
292
293The NAME section (C<=head1 NAME>) should consist of a single paragraph
294with the script/module name, followed by a dash `-' and a very short
295description of what the thing is good for.
296
297=item * =headI<n> without preceding higher level
298
299For example if there is a C<=head2> in the POD file prior to a
300C<=head1>.
301
302=back
303
304=head2 Hyperlinks
305
306There are some warnings with respect to malformed hyperlinks:
307
308=over 4
309
310=item * ignoring leading/trailing whitespace in link
311
312There is whitespace at the beginning or the end of the contents of
313LE<lt>...E<gt>.
314
315=item * alternative text/node '%s' contains non-escaped | or /
316
317The characters C<|> and C</> are special in the LE<lt>...E<gt> context.
318Although the hyperlink parser does its best to determine which "/" is
319text and which is a delimiter in case of doubt, one ought to escape
320these literal characters like this:
321
322  /     E<sol>
323  |     E<verbar>
324
325=back
326
327Note that the line number of the error/warning may refer to the line number of
328the start of the paragraph in which the error/warning exists, not the line
329number that the error/warning is on. This bug is present in errors/warnings
330related to formatting codes. I<This should be fixed.>
331
332=head1 RETURN VALUE
333
334B<podchecker> returns the number of POD syntax errors found or -1 if
335there were no POD commands at all found in the file.
336
337=head1 EXAMPLES
338
339See L</SYNOPSIS>
340
341=head1 SCRIPTS
342
343The B<podchecker> script that comes with this distribution is a lean wrapper
344around this module. See the online manual with
345
346  podchecker -help
347  podchecker -man
348
349=head1 INTERFACE
350
351While checking, this module collects document properties, e.g. the nodes
352for hyperlinks (C<=headX>, C<=item>) and index entries (C<XE<lt>E<gt>>).
353POD translators can use this feature to syntax-check and get the nodes in
354a first pass before actually starting to convert. This is expensive in terms
355of execution time, but allows for very robust conversions.
356
357Since v1.24 the B<Pod::Checker> module uses only the B<poderror>
358method to print errors and warnings. The summary output (e.g.
359"Pod syntax OK") has been dropped from the module and has been included in
360B<podchecker> (the script). This allows users of B<Pod::Checker> to
361control completely the output behavior. Users of B<podchecker> (the script)
362get the well-known behavior.
363
364v1.45 inherits from Pod::Simple as opposed to all previous versions
365inheriting from Pod::Parser. Do B<not> use Pod::Simple's interface when
366using Pod::Checker unless it is documented somewhere on this page. I
367repeat, DO B<NOT> USE POD::SIMPLE'S INTERFACE.
368
369=cut
370
371#############################################################################
372
373#use diagnostics;
374use Carp qw(croak);
375use Exporter 'import';
376use base qw/Pod::Simple::Methody/;
377
378our @EXPORT = qw(&podchecker);
379
380##---------------------------------
381## Function definitions begin here
382##---------------------------------
383
384sub podchecker {
385    my ($infile, $outfile, %options) = @_;
386    local $_;
387
388    ## Set defaults
389    $infile  ||= \*STDIN;
390    $outfile ||= \*STDERR;
391
392    ## Now create a pod checker
393    my $checker = Pod::Checker->new(%options);
394
395    ## Now check the pod document for errors
396    $checker->parse_from_file($infile, $outfile);
397
398    ## Return the number of errors found
399    return $checker->num_errors();
400}
401
402
403##---------------------------------------------------------------------------
404
405##-------------------------------
406## Method definitions begin here
407##-------------------------------
408
409##################################
410
411=over 4
412
413=item C<Pod::Checker-E<gt>new( %options )>
414
415Return a reference to a new Pod::Checker object that inherits from
416Pod::Simple and is used for calling the required methods later. The
417following options are recognized:
418
419C<-warnings =E<gt> num>
420  Print warnings if C<num> is true. The higher the value of C<num>,
421the more warnings are printed. Currently there are only levels 1 and 2.
422
423C<-quiet =E<gt> num>
424  If C<num> is true, do not print any errors/warnings. This is useful
425when Pod::Checker is used to munge POD code into plain text from within
426POD formatters.
427
428=cut
429
430sub new {
431    my $new = shift->SUPER::new(@_);
432    $new->{'output_fh'} ||= *STDERR{IO};
433
434    # Set options
435    my %opts = @_;
436    $new->{'-warnings'} = defined $opts{'-warnings'} ?
437                                  $opts{'-warnings'} : 1; # default on
438    $new->{'-quiet'} = $opts{'-quiet'} || 0; # default off
439
440    # Initialize number of errors/warnings
441    $new->{'_NUM_ERRORS'} = 0;
442    $new->{'_NUM_WARNINGS'} = 0;
443
444    # 'current' also means 'most recent' in the follow comments
445    $new->{'_thispara'} = '';       # current POD paragraph
446    $new->{'_line'} = 0;            # current line number
447    $new->{'_head_num'} = 0;        # current =head level (set to 0 to make
448                                    #   logic easier down the road)
449    $new->{'_cmds_since_head'} = 0; # num of POD directives since prev. =headN
450    $new->{'_nodes'} = [];          # stack for =head/=item nodes
451    $new->{'_fcode_stack'} = [];    # stack for nested formatting codes
452    $new->{'_fcode_pos'} = [];      # stack for position in paragraph of fcodes
453    $new->{'_begin_stack'} = [];    # stack for =begins: [line #, target]
454    $new->{'_links'} = [];          # stack for hyperlinks to external entities
455    $new->{'_internal_links'} = []; # set of linked-to internal sections
456    $new->{'_index'} = [];          # stack for text in X<>s
457
458    $new->accept_targets('*'); # check all =begin/=for blocks
459    $new->cut_handler( \&handle_pod_and_cut ); # warn if text after =cut
460    $new->pod_handler( \&handle_pod_and_cut ); # warn if text after =pod
461    $new->whiteline_handler( \&handle_whiteline ); # warn if whiteline
462    $new->parse_empty_lists(1); # warn if they are empty
463
464    return $new;
465}
466
467##################################
468
469=item C<$checker-E<gt>poderror( @args )>
470
471=item C<$checker-E<gt>poderror( {%opts}, @args )>
472
473Internal method for printing errors and warnings. If no options are given,
474simply prints "@_". The following options are recognized and used to form
475the output:
476
477  -msg
478
479A message to print prior to C<@args>.
480
481  -line
482
483The line number the error occurred in.
484
485  -file
486
487The file (name) the error occurred in. Defaults to the name of the current
488file being processed.
489
490  -severity
491
492The error level, should be 'WARNING' or 'ERROR'.
493
494=cut
495
496# Invoked as $self->poderror( @args ), or $self->poderror( {%opts}, @args )
497sub poderror {
498    my $self = shift;
499    my %opts = (ref $_[0]) ? %{shift()} : ();
500
501    ## Retrieve options
502    chomp( my $msg  = ($opts{'-msg'} || '')."@_" );
503    my $line = (exists $opts{'-line'}) ? " at line $opts{'-line'}" : '';
504    my $file = ' in file ' . ((exists $opts{'-file'})
505                              ? $opts{'-file'}
506                              : ((defined $self->source_filename)
507                                 ? $self->source_filename
508                                 : "???"));
509    unless (exists $opts{'-severity'}) {
510       ## See if can find severity in message prefix
511       $opts{'-severity'} = $1  if ( $msg =~ s/^\**\s*([A-Z]{3,}):\s+// );
512    }
513    my $severity = (exists $opts{'-severity'}) ? "*** $opts{-severity}: " : '';
514
515    ## Increment error count and print message "
516    ++($self->{'_NUM_ERRORS'})
517        if(!%opts || ($opts{-severity} && $opts{'-severity'} eq 'ERROR'));
518    ++($self->{'_NUM_WARNINGS'})
519        if(!%opts || ($opts{-severity} && $opts{'-severity'} eq 'WARNING'));
520    unless($self->{'-quiet'}) {
521      my $out_fh = $self->{'output_fh'} || \*STDERR;
522      print $out_fh ($severity, $msg, $line, $file, "\n")
523        if($self->{'-warnings'} || !%opts || $opts{'-severity'} ne 'WARNING');
524    }
525}
526
527##################################
528
529=item C<$checker-E<gt>num_errors()>
530
531Set (if argument specified) and retrieve the number of errors found.
532
533=cut
534
535sub num_errors {
536   return (@_ > 1) ? ($_[0]->{'_NUM_ERRORS'} = $_[1]) : $_[0]->{'_NUM_ERRORS'};
537}
538
539##################################
540
541=item C<$checker-E<gt>num_warnings()>
542
543Set (if argument specified) and retrieve the number of warnings found.
544
545=cut
546
547sub num_warnings {
548   return (@_ > 1) ? ($_[0]->{'_NUM_WARNINGS'} = $_[1]) :
549                      $_[0]->{'_NUM_WARNINGS'};
550}
551
552##################################
553
554=item C<$checker-E<gt>name()>
555
556Set (if argument specified) and retrieve the canonical name of POD as
557found in the C<=head1 NAME> section.
558
559=cut
560
561sub name {
562    return (@_ > 1 && $_[1]) ?
563        ($_[0]->{'_pod_name'} = $_[1]) : $_[0]->{'_pod_name'};
564}
565
566##################################
567
568=item C<$checker-E<gt>node()>
569
570Add (if argument specified) and retrieve the nodes (as defined by C<=headX>
571and C<=item>) of the current POD. The nodes are returned in the order of
572their occurrence. They consist of plain text, each piece of whitespace is
573collapsed to a single blank.
574
575=cut
576
577sub node {
578    my ($self,$text) = @_;
579    if(defined $text) {
580        $text =~ s/\s+$//s; # strip trailing whitespace
581        $text =~ s/\s+/ /gs; # collapse whitespace
582        # add node, order important!
583        push(@{$self->{'_nodes'}}, $text);
584        # keep also a uniqueness counter
585        $self->{'_unique_nodes'}->{$text}++ if($text !~ /^\s*$/s);
586        return $text;
587    }
588    @{$self->{'_nodes'}};
589}
590
591##################################
592
593=item C<$checker-E<gt>idx()>
594
595Add (if argument specified) and retrieve the index entries (as defined by
596C<XE<lt>E<gt>>) of the current POD. They consist of plain text, each piece
597of whitespace is collapsed to a single blank.
598
599=cut
600
601# set/return index entries of current POD
602sub idx {
603    my ($self,$text) = @_;
604    if(defined $text) {
605        $text =~ s/\s+$//s; # strip trailing whitespace
606        $text =~ s/\s+/ /gs; # collapse whitespace
607        # add node, order important!
608        push(@{$self->{'_index'}}, $text);
609        # keep also a uniqueness counter
610        $self->{'_unique_nodes'}->{$text}++ if($text !~ /^\s*$/s);
611        return $text;
612    }
613    @{$self->{'_index'}};
614}
615
616##################################
617
618# add a hyperlink to the list of those of the current POD; returns current
619# list after the addition has been done
620sub hyperlink {
621    my $self = shift;
622    push(@{$self->{'_links'}}, $_[0]);
623    return $_[0];
624}
625
626=item C<$checker-E<gt>hyperlinks()>
627
628Retrieve an array containing the hyperlinks to things outside
629the current POD (as defined by C<LE<lt>E<gt>>).
630
631Each is an instance of a class with the following methods:
632
633=cut
634
635sub hyperlinks {
636    @{shift->{'_links'}};
637}
638
639##################################
640
641# override Pod::Simple's whine() and scream() to use poderror()
642
643# Note:
644# Ignore $self->{'no_whining'} b/c $self->{'quiet'} takes care of it in poderror
645# Don't bother incrementing $self->{'errors_seen'} -- it's not used
646# Don't bother pushing to $self->{'errata'} b/c poderror() outputs immediately
647# We don't need to set $self->no_errata_section(1) b/c of these overrides
648
649
650sub whine {
651    my ($self, $line, $complaint) = @_;
652
653    my $severity = 'ERROR';
654
655    if (0) {
656      # XXX: Let's standardize what's a warning and what's an error.  Let's not
657      # move stuff up and down the severity tree.  -- rjbs, 2013-04-12
658      # Convert errors in Pod::Simple that are warnings in Pod::Checker
659      # XXX Do differently so the $complaint can be reworded without this breaking
660      $severity = 'WARNING' if
661          $complaint =~ /^Expected '=item .+?'$/ ||
662          $complaint =~ /^You can't have =items \(as at line .+?\) unless the first thing after the =over is an =item$/ ||
663          $complaint =~ /^You have '=item .+?' instead of the expected '=item .+?'$/;
664    }
665
666    $self->poderror({ -line => $line,
667                      -severity => $severity,
668                      -msg => $complaint });
669
670    return 1; # assume everything is peachy keen
671}
672
673sub scream {
674    my ($self, $line, $complaint) = @_;
675
676    $self->poderror({ -line => $line,
677                      -severity => 'ERROR', # consider making severity 'FATAL'
678                      -msg => $complaint });
679
680    return 1;
681}
682
683
684##################################
685
686# Some helper subroutines
687
688sub _init_event { # assignments done at the start of most events
689    $_[0]{'_thispara'} = '';
690    $_[0]{'_line'} = $_[1]{'start_line'};
691    $_[0]{'_cmds_since_head'}++;
692}
693
694sub _check_fcode {
695    my ($self, $inner, $outers) = @_;
696    # Check for an fcode inside another of the same fcode
697    # XXX line number is the line of the start of the paragraph that the warning
698    # is in, not the line that the warning is on. Fix this
699
700    # Later versions of Pod::Simple forbid nested L<>'s
701    return if $inner eq 'L' && $Pod::Simple::VERSION ge '3.33';
702
703    if (grep { $_ eq $inner } @$outers) {
704        $self->poderror({ -line => $self->{'_line'},
705                          -severity => 'WARNING',
706                          -msg => "nested commands $inner<...$inner<...>...>"});
707    }
708}
709
710##################################
711
712sub handle_text { $_[0]{'_thispara'} .= $_[1] }
713
714# whiteline is a seemingly blank line that matches /[^\S\r\n]/
715sub handle_whiteline {
716    my ($line, $line_n, $self) = @_;
717    $self->poderror({
718        -line => $line_n,
719        -severity => 'WARNING',
720        -msg => 'line containing nothing but whitespace in paragraph'});
721}
722
723######## Directives
724sub handle_pod_and_cut {
725    my ($line, $line_n, $self) = @_;
726    $self->{'_cmds_since_head'}++;
727    if ($line =~ /=(pod|cut)\s+\S/) {
728        $self->poderror({ -line => $line_n,
729                          -severity => 'ERROR',
730                          -msg => "Spurious text after =$1"});
731    }
732}
733
734sub start_Para { shift->_init_event(@_); }
735sub end_Para   {
736    my $self = shift;
737    # Get the NAME of the pod document
738    if ($self->{'_head_num'} == 1 && $self->{'_head_text'} eq 'NAME') {
739        if ($self->{'_thispara'} =~ /^\s*(\S+?)\s*[,-]/) {
740            $self->{'_pod_name'} = $1 unless defined $self->{'_pod_name'};
741        }
742    }
743}
744
745sub start_Verbatim {
746    my $self = shift;
747    $self->_init_event(@_);
748
749    if ($self->{'_head_num'} == 1 && $self->{'_head_text'} eq 'NAME') {
750        $self->poderror({ -line => $self->{'_line'},
751                          -severity => 'WARNING',
752                          -msg => 'Verbatim paragraph in NAME section' });
753    }
754}
755# Don't need an end_Verbatim
756
757# Do I need to do anything else with this?
758sub start_Data { shift->_init_event() }
759
760sub start_head1 { shift->start_head(1, @_) }
761sub start_head2 { shift->start_head(2, @_) }
762sub start_head3 { shift->start_head(3, @_) }
763sub start_head4 { shift->start_head(4, @_) }
764sub start_head  {
765    my $self = shift;
766    my $h = shift;
767    $self->_init_event(@_);
768    my $prev_h = $self->{'_head_num'};
769    $self->{'_head_num'} = $h;
770    $self->{"_count_head$h"}++;
771
772    if ($h > 1 && !$self->{'_count_head'.($h-1)}) {
773        $self->poderror({ -line => $self->{'_line'},
774                          -severity => 'WARNING',
775                          -msg => "=head$h without preceding higher level"});
776    }
777
778    # If this is the first =head of the doc, $prev_h is 0, thus less than $h
779    if ($self->{'_cmds_since_head'} == 1 && $prev_h >= $h) {
780        $self->poderror({ -line => $self->{'_line'},
781                          -severity => 'WARNING',
782                          -msg => 'empty section in previous paragraph'});
783    }
784}
785
786sub end_head1 { shift->end_head(@_) }
787sub end_head2 { shift->end_head(@_) }
788sub end_head3 { shift->end_head(@_) }
789sub end_head4 { shift->end_head(@_) }
790sub end_head  {
791    my $self = shift;
792    my $arg = $self->{'_thispara'};
793    $arg =~ s/\s+$//;
794    $self->{'_head_text'} = $arg;
795    $self->{'_cmds_since_head'} = 0;
796    my $h = $self->{'_head_num'};
797    $self->node($arg); # remember this node
798    if ($arg eq '') {
799        $self->poderror({ -line => $self->{'_line'},
800                          -severity => 'ERROR',
801                          -msg => "empty =head$h" });
802    }
803}
804
805sub start_over_bullet { shift->start_over(@_, 'bullet') }
806sub start_over_number { shift->start_over(@_, 'number') }
807sub start_over_text   { shift->start_over(@_, 'definition') }
808sub start_over_block  { shift->start_over(@_, 'block') }
809sub start_over_empty  {
810    my $self = shift;
811    $self->start_over(@_, 'empty');
812    $self->poderror({ -line => $self->{'_line'},
813                      -severity => 'WARNING',
814                      -msg => 'empty =over/=back block' });
815}
816sub start_over {
817    my $self = shift;
818    my $type = pop;
819    $self->_init_event(@_);
820}
821
822sub start_item_bullet { shift->_init_event(@_) }
823sub start_item_number { shift->_init_event(@_) }
824sub start_item_text   { shift->_init_event(@_) }
825sub end_item_bullet { shift->end_item('bullet') }
826sub end_item_number { shift->end_item('number') }
827sub end_item_text   { shift->end_item('definition') }
828sub end_item {
829    my $self = shift;
830    my $type = shift;
831    # If there is verbatim text in this item, it will show up as part of
832    # 'paras', and not part of '_thispara'.  If the first para after this is a
833    # verbatim one, it actually will be (part of) the contents for this item.
834    if (   $self->{'_thispara'} eq ''
835        && (  ! @{$self->{'paras'}}
836            ||    $self->{'paras'}[0][0] !~ /Verbatim/i))
837    {
838        $self->poderror({ -line => $self->{'_line'},
839                          -severity => 'WARNING',
840                          -msg => '=item has no contents' });
841    }
842
843    $self->node($self->{'_thispara'}); # remember this node
844}
845
846sub start_for { # =for and =begin directives
847    my ($self, $flags) = @_;
848    $self->_init_event($flags);
849    push @{$self->{'_begin_stack'}}, [$self->{'_line'}, $flags->{'target'}];
850}
851
852sub end_for {
853    my ($self, $flags) = @_;
854    my ($line, $target) = @{pop @{$self->{'_begin_stack'}}};
855    if ($flags->{'fake-closer'}) { # meaning Pod::Simple generated this =end
856        $self->poderror({ -line => $line,
857                          -severity => 'ERROR',
858                          -msg => "=begin $target without matching =end $target"
859                        });
860    }
861}
862
863sub end_Document {
864    # Some final error checks
865    my $self = shift;
866
867    # no POD found here
868    $self->num_errors(-1) && return unless $self->content_seen;
869
870    my %nodes;
871    for ($self->node()) {
872        $nodes{$_} = 1;
873        if(/^(\S+)\s+\S/) {
874            # we have more than one word. Use the first as a node, too.
875            # This is used heavily in perlfunc.pod
876            $nodes{$1} ||= 2; # derived node
877        }
878    }
879    for ($self->idx()) {
880        $nodes{$_} = 3; # index node
881    }
882
883    # XXX update unresolved internal link POD -- single word not enclosed in ""?
884    # I don't know what I was thinking when I made the above TODO, and I don't
885    # know what it means...
886
887    for my $link (@{ $self->{'_internal_links'} }) {
888        my ($name, $line) = @$link;
889        unless ( $nodes{$name} ) {
890            $self->poderror({ -line => $line,
891                              -severity => 'ERROR',
892                              -msg => "unresolved internal link '$name'"});
893        }
894    }
895
896    # check the internal nodes for uniqueness. This pertains to
897    # =headX, =item and X<...>
898    if ($self->{'-warnings'} > 1 ) {
899        for my $node (sort keys %{ $self->{'_unique_nodes'} }) {
900            my $count = $self->{'_unique_nodes'}{$node};
901            if ($count > 1) { # not unique
902                $self->poderror({
903                    -line => '-',
904                    -severity => 'WARNING',
905                    -msg => "multiple occurrences ($count) of link target ".
906                        "'$node'"});
907            }
908        }
909    }
910}
911
912########  Formatting codes
913
914sub start_B { shift->start_fcode('B') }
915sub start_C { shift->start_fcode('C') }
916sub start_F { shift->start_fcode('F') }
917sub start_I { shift->start_fcode('I') }
918sub start_S { shift->start_fcode('S') }
919sub start_fcode {
920    my ($self, $fcode) = @_;
921    unshift @{$self->{'_fcode_stack'}}, $fcode;
922}
923
924sub end_B { shift->end_fcode() }
925sub end_C { shift->end_fcode() }
926sub end_F { shift->end_fcode() }
927sub end_I { shift->end_fcode() }
928sub end_S { shift->end_fcode() }
929sub end_fcode {
930    my $self = shift;
931    $self->_check_fcode(shift @{$self->{'_fcode_stack'}}, # current fcode removed
932                        $self->{'_fcode_stack'}); # previous fcodes
933}
934
935sub start_L {
936    my ($self, $flags) = @_;
937    $self->start_fcode('L');
938
939    my $link = Pod::Checker::Hyperlink->new($flags, $self);
940    if ($link) {
941        if (   $link->type eq 'pod'
942            && $link->node
943                # It's an internal-to-this-page link if no page is given, or
944                # if the given one is to our NAME.
945            && (! $link->page || (   $self->{'_pod_name'}
946                                  && $link->page eq $self->{'_pod_name'})))
947        {
948            push @{ $self->{'_internal_links'} }, [ $link->{'-raw_node'}, $link->line ];
949        }
950        else {
951            $self->hyperlink($link);
952        }
953    }
954}
955
956sub end_L {
957    my $self = shift;
958    $self->end_fcode();
959}
960
961sub start_X {
962    my $self = shift;
963    $self->start_fcode('X');
964    # keep track of where X<> starts in the paragraph
965    # (this is a stack so nested X<>s are handled correctly)
966    push @{$self->{'_fcode_pos'}}, length $self->{'_thispara'};
967}
968sub end_X {
969    my $self = shift;
970    # extract contents of X<> and replace with ''
971    my $start = pop @{$self->{'_fcode_pos'}}; # start at the beginning of X<>
972    my $end = length($self->{'_thispara'}) - $start; # end at end of X<>
973    my $x = substr($self->{'_thispara'}, $start, $end, '');
974    if ($x eq "") {
975        $self->poderror({ -line => $self->{'_line'},
976                          -severity => 'ERROR',
977                          -msg => "An empty X<>" });
978    }
979    $self->idx($x); # remember this node
980    $self->end_fcode();
981}
982
983package Pod::Checker::Hyperlink;
984
985# This class is used to represent L<> link structures, so that the individual
986# elements are easily accessible.  It is based on code in Pod::Hyperlink
987
988sub new {
989    my ($class,
990        $simple_link,   # The link structure returned by Pod::Simple
991        $caller         # The caller class
992    ) = @_;
993
994    my $self = +{};
995    bless $self, $class;
996
997    $self->{'-line'} ||= $caller->{'_line'};
998    $self->{'-type'} ||= $simple_link->{'type'};
999
1000    # Force stringification of page and node.  (This expands any E<>.)
1001    $self->{'-page'} = exists $simple_link->{'to'} ? "$simple_link->{'to'}" : "";
1002    $self->{'-node'} = exists $simple_link->{'section'} ? "$simple_link->{'section'}" : "";
1003
1004    # Save the unmodified node text, as the .t files are expecting the message
1005    # for internal link failures to include it (hence this preserves backward
1006    # compatibility).
1007    $self->{'-raw_node'} = $self->{'-node'};
1008
1009    # Remove leading/trailing white space.  Pod::Simple already warns about
1010    # these, so if the only error is this, and the link is otherwise correct,
1011    # only the Pod::Simple warning will be output, avoiding unnecessary
1012    # confusion.
1013    $self->{'-page'} =~ s/ ^ \s+ //x;
1014    $self->{'-page'} =~ s/ \s+ $ //x;
1015
1016    $self->{'-node'} =~ s/ ^ \s+ //x;
1017    $self->{'-node'} =~ s/ \s+ $ //x;
1018
1019    # Pod::Simple warns about L<> and L< >, but not L</>
1020    if ($self->{'-page'} eq "" && $self->{'-node'} eq "") {
1021        $caller->poderror({ -line => $caller->{'_line'},
1022                          -severity => 'WARNING',
1023                          -msg => 'empty link'});
1024        return;
1025    }
1026
1027    return $self;
1028}
1029
1030=item line()
1031
1032Returns the approximate line number in which the link was encountered
1033
1034=cut
1035
1036sub line {
1037    return $_[0]->{-line};
1038}
1039
1040=item type()
1041
1042Returns the type of the link; one of:
1043C<"url"> for things like
1044C<http://www.foo>, C<"man"> for man pages, or C<"pod">.
1045
1046=cut
1047
1048sub type {
1049    return  $_[0]->{-type};
1050}
1051
1052=item page()
1053
1054Returns the linked-to page or url.
1055
1056=cut
1057
1058sub page {
1059    return $_[0]->{-page};
1060}
1061
1062=item node()
1063
1064Returns the anchor or node within the linked-to page, or an empty string
1065(C<"">) if none appears in the link.
1066
1067=back
1068
1069=cut
1070
1071sub node {
1072    return $_[0]->{-node};
1073}
1074
1075=head1 AUTHOR
1076
1077Please report bugs using L<http://rt.cpan.org>.
1078
1079Brad Appleton E<lt>bradapp@enteract.comE<gt> (initial version),
1080Marek Rouchal E<lt>marekr@cpan.orgE<gt>,
1081Marc Green E<lt>marcgreen@cpan.orgE<gt> (port to Pod::Simple)
1082Ricardo Signes E<lt>rjbs@cpan.orgE<gt> (more porting to Pod::Simple)
1083Karl Williamson E<lt>khw@cpan.orgE<gt> (more porting to Pod::Simple)
1084
1085Based on code for B<Pod::Text::pod2text()> written by
1086Tom Christiansen E<lt>tchrist@mox.perl.comE<gt>
1087
1088=cut
1089
10901
1091