1#############################################################################
2# Pod/Checker.pm -- check pod documents for syntax errors
3#
4# Copyright (C) 1994-2000 by Bradford Appleton. All rights reserved.
5# This is free software; you can redistribute it and/or modify it under the
6# same terms as Perl itself.
7#############################################################################
8
9package Pod::Checker;
10use strict;
11use warnings;
12
13our $VERSION = '1.74';  ## Current version of this package
14
15=head1 NAME
16
17Pod::Checker - check pod documents for syntax errors
18
19=head1 SYNOPSIS
20
21  use Pod::Checker;
22
23  $syntax_okay = podchecker($filepath, $outputpath, %options);
24
25  my $checker = Pod::Checker->new(%options);
26  $checker->parse_from_file($filepath, \*STDERR);
27
28=head1 OPTIONS/ARGUMENTS
29
30C<$filepath> is the input POD to read and C<$outputpath> is
31where to write POD syntax error messages. Either argument may be a scalar
32indicating a file-path, or else a reference to an open filehandle.
33If unspecified, the input-file it defaults to C<\*STDIN>, and
34the output-file defaults to C<\*STDERR>.
35
36=head2 podchecker()
37
38This function can take a hash of options:
39
40=over 4
41
42=item B<-warnings> =E<gt> I<val>
43
44Turn warnings on/off. I<val> is usually 1 for on, but higher values
45trigger additional warnings. See L<"Warnings">.
46
47=item B<-quiet> =E<gt> I<val>
48
49If C<val> is true, do not print any errors/warnings.
50
51=back
52
53=head1 DESCRIPTION
54
55B<podchecker> will perform syntax checking of Perl5 POD format documentation.
56
57Curious/ambitious users are welcome to propose additional features they wish
58to see in B<Pod::Checker> and B<podchecker> and verify that the checks are
59consistent with L<perlpod>.
60
61The following checks are currently performed:
62
63=over 4
64
65=item *
66
67Unknown '=xxxx' commands, unknown 'XE<lt>...E<gt>' interior-sequences,
68and unterminated interior sequences.
69
70=item *
71
72Check for proper balancing of C<=begin> and C<=end>. The contents of such
73a block are generally ignored, i.e. no syntax checks are performed.
74
75=item *
76
77Check for proper nesting and balancing of C<=over>, C<=item> and C<=back>.
78
79=item *
80
81Check for same nested interior-sequences (e.g.
82C<LE<lt>...LE<lt>...E<gt>...E<gt>>).
83
84=item *
85
86Check for malformed or non-existing entities C<EE<lt>...E<gt>>.
87
88=item *
89
90Check for correct syntax of hyperlinks C<LE<lt>...E<gt>>. See L<perlpod>
91for details.
92
93=item *
94
95Check for unresolved document-internal links. This check may also reveal
96misspelled links that seem to be internal links but should be links
97to something else.
98
99=back
100
101=head1 DIAGNOSTICS
102
103=head2 Errors
104
105=over 4
106
107=item * empty =headn
108
109A heading (C<=head1> or C<=head2>) without any text? That ain't no
110heading!
111
112=item * =over on line I<N> without closing =back
113
114=item * You forgot a '=back' before '=headI<N>'
115
116=item * =over is the last thing in the document?!
117
118The C<=over> command does not have a corresponding C<=back> before the
119next heading (C<=head1> or C<=head2>) or the end of the file.
120
121=item * '=item' outside of any '=over'
122
123=item * =back without =over
124
125An C<=item> or C<=back> command has been found outside a
126C<=over>/C<=back> block.
127
128=item * Can't have a 0 in =over I<N>
129
130You need to indent a strictly positive number of spaces, not 0.
131
132=item * =over should be: '=over' or '=over positive_number'
133
134Either have an argumentless =over, or have its argument a strictly positive number.
135
136=item * =begin I<TARGET> without matching =end I<TARGET>
137
138A C<=begin> command was found that has no matching =end command.
139
140=item * =begin without a target?
141
142A C<=begin> command was found that is not followed by the formatter
143specification.
144
145=item * =end I<TARGET> without matching =begin.
146
147A standalone C<=end> command was found.
148
149=item * '=end' without a target?
150
151'=end' directives need to have a target, just like =begin directives.
152
153=item * '=end I<TARGET>' is invalid.
154
155I<TARGET> needs to be one word
156
157=item * =end I<CONTENT> doesn't match =begin I<TARGET>
158
159I<CONTENT> needs to match =begin's I<TARGET>.
160
161=item * =for without a target?
162
163There is no specification of the formatter after the C<=for> command.
164
165=item * unresolved internal link I<NAME>
166
167The given link to I<NAME> does not have a matching node in the current
168POD. This also happened when a single word node name is not enclosed in
169C<"">.
170
171=item * Unknown directive: I<CMD>
172
173An invalid POD command has been found. Valid are C<=head1>, C<=head2>,
174C<=head3>, C<=head4>, C<=over>, C<=item>, C<=back>, C<=begin>, C<=end>,
175C<=for>, C<=pod>, C<=cut>
176
177=item * Deleting unknown formatting code I<SEQ>
178
179An invalid markup command has been encountered. Valid are:
180C<BE<lt>E<gt>>, C<CE<lt>E<gt>>, C<EE<lt>E<gt>>, C<FE<lt>E<gt>>,
181C<IE<lt>E<gt>>, C<LE<lt>E<gt>>, C<SE<lt>E<gt>>, C<XE<lt>E<gt>>,
182C<ZE<lt>E<gt>>
183
184=item * Unterminated I<SEQ>E<lt>E<gt> sequence
185
186An unclosed formatting code
187
188=item * An EE<lt>...E<gt> surrounding strange content
189
190The I<STRING> found cannot be interpreted as a character entity.
191
192=item * An empty EE<lt>E<gt>
193
194=item * An empty C<< LE<lt>E<gt> >>
195
196=item * An empty XE<lt>E<gt>
197
198There needs to be content inside E, L, and X formatting codes.
199
200=item * Spurious text after =pod / =cut
201
202The commands C<=pod> and C<=cut> do not take any arguments.
203
204=item * =back doesn't take any parameters, but you said =back I<ARGUMENT>
205
206The C<=back> command does not take any arguments.
207
208=item * =pod directives shouldn't be over one line long!  Ignoring all I<N> lines of content
209
210Self explanatory
211
212=item * =cut found outside a pod block.
213
214A '=cut' directive found in the middle of non-POD
215
216=item * Invalid =encoding syntax: I<CONTENT>
217
218Syntax error in =encoding directive
219
220=back
221
222=head2 Warnings
223
224These may not necessarily cause trouble, but indicate mediocre style.
225
226=over 4
227
228=item * nested commands I<CMD>E<lt>...I<CMD>E<lt>...E<gt>...E<gt>
229
230Two nested identical markup commands have been found. Generally this
231does not make sense.
232
233=item * multiple occurrences (I<N>) of link target I<name>
234
235The POD file has some C<=item> and/or C<=head> commands that have
236the same text. Potential hyperlinks to such a text cannot be unique then.
237This warning is printed only with warning level greater than one.
238
239=item * line containing nothing but whitespace in paragraph
240
241There is some whitespace on a seemingly empty line. POD is very sensitive
242to such things, so this is flagged. B<vi> users switch on the B<list>
243option to avoid this problem.
244
245=item * =item has no contents
246
247There is a list C<=item> that has no text contents. You probably want to delete
248empty items.
249
250=item * You can't have =items (as at line I<N>) unless the first thing after the =over is an =item
251
252A list introduced by C<=over> starts with a text or verbatim paragraph,
253but continues with C<=item>s. Move the non-item paragraph out of the
254C<=over>/C<=back> block.
255
256=item * Expected '=item I<EXPECTED VALUE>'
257
258=item * Expected '=item *'
259
260=item * Possible =item type mismatch: 'I<x>' found leading a supposed definition =item
261
262A list started with e.g. a bullet-like C<=item> and continued with a
263numbered one. This is obviously inconsistent. For most translators the
264type of the I<first> C<=item> determines the type of the list.
265
266=item * You have '=item x' instead of the expected '=item I<N>'
267
268Erroneous numbering of =item numbers; they need to ascend consecutively.
269
270=item * Unknown E content in EE<lt>I<CONTENT>E<gt>
271
272A character entity was found that does not belong to the standard
273ISO set or the POD specials C<verbar> and C<sol>. I<Currently, this warning
274only appears if a character entity was found that does not have a Unicode
275character. This should be fixed to adhere to the original warning.>
276
277=item * empty =over/=back block
278
279The list opened with C<=over> does not contain anything.
280
281=item * empty section in previous paragraph
282
283The previous section (introduced by a C<=head> command) does not contain
284any valid content. This usually indicates that something is missing. Note: A
285C<=head1> followed immediately by C<=head2> does not trigger this warning.
286
287=item * Verbatim paragraph in NAME section
288
289The NAME section (C<=head1 NAME>) should consist of a single paragraph
290with the script/module name, followed by a dash `-' and a very short
291description of what the thing is good for.
292
293=item * =headI<n> without preceding higher level
294
295For example if there is a C<=head2> in the POD file prior to a
296C<=head1>.
297
298=item * A non-empty ZE<lt>E<gt>
299
300The C<ZE<lt>E<gt>> sequence is supposed to be empty. Caveat: this issue is
301detected in L<Pod::Simple> and will be flagged as an I<ERROR> by any client
302code; any contents of C<ZE<lt>...E<gt>> will be disregarded, anyway.
303
304=back
305
306=head2 Hyperlinks
307
308There are some warnings with respect to malformed hyperlinks:
309
310=over 4
311
312=item * ignoring leading/trailing whitespace in link
313
314There is whitespace at the beginning or the end of the contents of
315LE<lt>...E<gt>.
316
317=item * alternative text/node '%s' contains non-escaped | or /
318
319The characters C<|> and C</> are special in the LE<lt>...E<gt> context.
320Although the hyperlink parser does its best to determine which "/" is
321text and which is a delimiter in case of doubt, one ought to escape
322these literal characters like this:
323
324  /     E<sol>
325  |     E<verbar>
326
327=back
328
329Note that the line number of the error/warning may refer to the line number of
330the start of the paragraph in which the error/warning exists, not the line
331number that the error/warning is on. This bug is present in errors/warnings
332related to formatting codes. I<This should be fixed.>
333
334=head1 RETURN VALUE
335
336B<podchecker> returns the number of POD syntax errors found or -1 if
337there were no POD commands at all found in the file.
338
339=head1 EXAMPLES
340
341See L</SYNOPSIS>
342
343=head1 SCRIPTS
344
345The B<podchecker> script that comes with this distribution is a lean wrapper
346around this module. See the online manual with
347
348  podchecker -help
349  podchecker -man
350
351=head1 INTERFACE
352
353While checking, this module collects document properties, e.g. the nodes
354for hyperlinks (C<=headX>, C<=item>) and index entries (C<XE<lt>E<gt>>).
355POD translators can use this feature to syntax-check and get the nodes in
356a first pass before actually starting to convert. This is expensive in terms
357of execution time, but allows for very robust conversions.
358
359Since v1.24 the B<Pod::Checker> module uses only the B<poderror>
360method to print errors and warnings. The summary output (e.g.
361"Pod syntax OK") has been dropped from the module and has been included in
362B<podchecker> (the script). This allows users of B<Pod::Checker> to
363control completely the output behavior. Users of B<podchecker> (the script)
364get the well-known behavior.
365
366v1.45 inherits from L<Pod::Simple> as opposed to all previous versions
367inheriting from Pod::Parser. Do B<not> use Pod::Simple's interface when
368using Pod::Checker unless it is documented somewhere on this page. I
369repeat, DO B<NOT> USE POD::SIMPLE'S INTERFACE.
370
371The following list documents the overrides to Pod::Simple, primarily to
372make L<Pod::Coverage> happy:
373
374=over 4
375
376=item end_B
377
378=item end_C
379
380=item end_Document
381
382=item end_F
383
384=item end_I
385
386=item end_L
387
388=item end_Para
389
390=item end_S
391
392=item end_X
393
394=item end_fcode
395
396=item end_for
397
398=item end_head
399
400=item end_head1
401
402=item end_head2
403
404=item end_head3
405
406=item end_head4
407
408=item end_item
409
410=item end_item_bullet
411
412=item end_item_number
413
414=item end_item_text
415
416=item handle_pod_and_cut
417
418=item handle_text
419
420=item handle_whiteline
421
422=item hyperlink
423
424=item scream
425
426=item start_B
427
428=item start_C
429
430=item start_Data
431
432=item start_F
433
434=item start_I
435
436=item start_L
437
438=item start_Para
439
440=item start_S
441
442=item start_Verbatim
443
444=item start_X
445
446=item start_fcode
447
448=item start_for
449
450=item start_head
451
452=item start_head1
453
454=item start_head2
455
456=item start_head3
457
458=item start_head4
459
460=item start_item_bullet
461
462=item start_item_number
463
464=item start_item_text
465
466=item start_over
467
468=item start_over_block
469
470=item start_over_bullet
471
472=item start_over_empty
473
474=item start_over_number
475
476=item start_over_text
477
478=item whine
479
480=back
481
482=cut
483
484#############################################################################
485
486#use diagnostics;
487use Carp qw(croak);
488use Exporter 'import';
489use base qw/Pod::Simple::Methody/;
490
491our @EXPORT = qw(&podchecker);
492
493##---------------------------------
494## Function definitions begin here
495##---------------------------------
496
497sub podchecker {
498    my ($infile, $outfile, %options) = @_;
499    local $_;
500
501    ## Set defaults
502    $infile  ||= \*STDIN;
503    $outfile ||= \*STDERR;
504
505    ## Now create a pod checker
506    my $checker = Pod::Checker->new(%options);
507
508    ## Now check the pod document for errors
509    $checker->parse_from_file($infile, $outfile);
510
511    ## Return the number of errors found
512    return $checker->num_errors();
513}
514
515
516##---------------------------------------------------------------------------
517
518##-------------------------------
519## Method definitions begin here
520##-------------------------------
521
522##################################
523
524=over 4
525
526=item C<Pod::Checker-E<gt>new( %options )>
527
528Return a reference to a new Pod::Checker object that inherits from
529Pod::Simple and is used for calling the required methods later. The
530following options are recognized:
531
532C<-warnings =E<gt> num>
533  Print warnings if C<num> is true. The higher the value of C<num>,
534the more warnings are printed. Currently there are only levels 1 and 2.
535
536C<-quiet =E<gt> num>
537  If C<num> is true, do not print any errors/warnings. This is useful
538when Pod::Checker is used to munge POD code into plain text from within
539POD formatters.
540
541=cut
542
543sub new {
544    my $new = shift->SUPER::new(@_);
545    $new->{'output_fh'} ||= *STDERR{IO};
546
547    # Set options
548    my %opts = @_;
549    $new->{'-warnings'} = defined $opts{'-warnings'} ?
550                                  $opts{'-warnings'} : 1; # default on
551    $new->{'-quiet'} = $opts{'-quiet'} || 0; # default off
552
553    # Initialize number of errors/warnings
554    $new->{'_NUM_ERRORS'} = 0;
555    $new->{'_NUM_WARNINGS'} = 0;
556
557    # 'current' also means 'most recent' in the follow comments
558    $new->{'_thispara'} = '';       # current POD paragraph
559    $new->{'_line'} = 0;            # current line number
560    $new->{'_head_num'} = 0;        # current =head level (set to 0 to make
561                                    #   logic easier down the road)
562    $new->{'_cmds_since_head'} = 0; # num of POD directives since prev. =headN
563    $new->{'_nodes'} = [];          # stack for =head/=item nodes
564    $new->{'_fcode_stack'} = [];    # stack for nested formatting codes
565    $new->{'_fcode_pos'} = [];      # stack for position in paragraph of fcodes
566    $new->{'_begin_stack'} = [];    # stack for =begins: [line #, target]
567    $new->{'_links'} = [];          # stack for hyperlinks to external entities
568    $new->{'_internal_links'} = []; # set of linked-to internal sections
569    $new->{'_index'} = [];          # stack for text in X<>s
570
571    $new->accept_targets('*'); # check all =begin/=for blocks
572    $new->cut_handler( \&handle_pod_and_cut ); # warn if text after =cut
573    $new->pod_handler( \&handle_pod_and_cut ); # warn if text after =pod
574    $new->whiteline_handler( \&handle_whiteline ); # warn if whiteline
575    $new->parse_empty_lists(1); # warn if they are empty
576
577    return $new;
578}
579
580##################################
581
582=item C<$checker-E<gt>poderror( @args )>
583
584=item C<$checker-E<gt>poderror( {%opts}, @args )>
585
586Internal method for printing errors and warnings. If no options are given,
587simply prints "@_". The following options are recognized and used to form
588the output:
589
590  -msg
591
592A message to print prior to C<@args>.
593
594  -line
595
596The line number the error occurred in.
597
598  -file
599
600The file (name) the error occurred in. Defaults to the name of the current
601file being processed.
602
603  -severity
604
605The error level, should be 'WARNING' or 'ERROR'.
606
607=cut
608
609# Invoked as $self->poderror( @args ), or $self->poderror( {%opts}, @args )
610sub poderror {
611    my $self = shift;
612    my %opts = (ref $_[0]) ? %{shift()} : ();
613
614    ## Retrieve options
615    chomp( my $msg  = ($opts{'-msg'} || '')."@_" );
616    my $line = (exists $opts{'-line'}) ? " at line $opts{'-line'}" : '';
617    my $file = ' in file ' . ((exists $opts{'-file'})
618                              ? $opts{'-file'}
619                              : ((defined $self->source_filename)
620                                 ? $self->source_filename
621                                 : "???"));
622    unless (exists $opts{'-severity'}) {
623       ## See if can find severity in message prefix
624       $opts{'-severity'} = $1  if ( $msg =~ s/^\**\s*([A-Z]{3,}):\s+// );
625    }
626    my $severity = (exists $opts{'-severity'}) ? "*** $opts{-severity}: " : '';
627
628    ## Increment error count and print message "
629    ++($self->{'_NUM_ERRORS'})
630        if(!%opts || ($opts{-severity} && $opts{'-severity'} eq 'ERROR'));
631    ++($self->{'_NUM_WARNINGS'})
632        if(!%opts || ($opts{-severity} && $opts{'-severity'} eq 'WARNING'));
633    unless($self->{'-quiet'}) {
634      my $out_fh = $self->{'output_fh'} || \*STDERR;
635      print $out_fh ($severity, $msg, $line, $file, "\n")
636        if($self->{'-warnings'} || !%opts || $opts{'-severity'} ne 'WARNING');
637    }
638}
639
640##################################
641
642=item C<$checker-E<gt>num_errors()>
643
644Set (if argument specified) and retrieve the number of errors found.
645
646=cut
647
648sub num_errors {
649   return (@_ > 1) ? ($_[0]->{'_NUM_ERRORS'} = $_[1]) : $_[0]->{'_NUM_ERRORS'};
650}
651
652##################################
653
654=item C<$checker-E<gt>num_warnings()>
655
656Set (if argument specified) and retrieve the number of warnings found.
657
658=cut
659
660sub num_warnings {
661   return (@_ > 1) ? ($_[0]->{'_NUM_WARNINGS'} = $_[1]) :
662                      $_[0]->{'_NUM_WARNINGS'};
663}
664
665##################################
666
667=item C<$checker-E<gt>name()>
668
669Set (if argument specified) and retrieve the canonical name of POD as
670found in the C<=head1 NAME> section.
671
672=cut
673
674sub name {
675    return (@_ > 1 && $_[1]) ?
676        ($_[0]->{'_pod_name'} = $_[1]) : $_[0]->{'_pod_name'};
677}
678
679##################################
680
681=item C<$checker-E<gt>node()>
682
683Add (if argument specified) and retrieve the nodes (as defined by C<=headX>
684and C<=item>) of the current POD. The nodes are returned in the order of
685their occurrence. They consist of plain text, each piece of whitespace is
686collapsed to a single blank.
687
688=cut
689
690sub node {
691    my ($self,$text) = @_;
692    if(defined $text) {
693        $text =~ s/\s+$//s; # strip trailing whitespace
694        $text =~ s/\s+/ /gs; # collapse whitespace
695        # add node, order important!
696        push(@{$self->{'_nodes'}}, $text);
697        # keep also a uniqueness counter
698        $self->{'_unique_nodes'}->{$text}++ if($text !~ /^\s*$/s);
699        return $text;
700    }
701    @{$self->{'_nodes'}};
702}
703
704##################################
705
706=item C<$checker-E<gt>idx()>
707
708Add (if argument specified) and retrieve the index entries (as defined by
709C<XE<lt>E<gt>>) of the current POD. They consist of plain text, each piece
710of whitespace is collapsed to a single blank.
711
712=cut
713
714# set/return index entries of current POD
715sub idx {
716    my ($self,$text) = @_;
717    if(defined $text) {
718        $text =~ s/\s+$//s; # strip trailing whitespace
719        $text =~ s/\s+/ /gs; # collapse whitespace
720        # add node, order important!
721        push(@{$self->{'_index'}}, $text);
722        # keep also a uniqueness counter
723        $self->{'_unique_nodes'}->{$text}++ if($text !~ /^\s*$/s);
724        return $text;
725    }
726    @{$self->{'_index'}};
727}
728
729##################################
730
731# add a hyperlink to the list of those of the current POD; returns current
732# list after the addition has been done
733sub hyperlink {
734    my $self = shift;
735    push(@{$self->{'_links'}}, $_[0]);
736    return $_[0];
737}
738
739=item C<$checker-E<gt>hyperlinks()>
740
741Retrieve an array containing the hyperlinks to things outside
742the current POD (as defined by C<LE<lt>E<gt>>).
743
744Each is an instance of a class with the following methods:
745
746=cut
747
748sub hyperlinks {
749    @{shift->{'_links'}};
750}
751
752##################################
753
754# override Pod::Simple's whine() and scream() to use poderror()
755
756# Note:
757# Ignore $self->{'no_whining'} b/c $self->{'quiet'} takes care of it in poderror
758# Don't bother incrementing $self->{'errors_seen'} -- it's not used
759# Don't bother pushing to $self->{'errata'} b/c poderror() outputs immediately
760# We don't need to set $self->no_errata_section(1) b/c of these overrides
761
762
763sub whine {
764    my ($self, $line, $complaint) = @_;
765
766    my $severity = 'ERROR';
767
768    if (0) {
769      # XXX: Let's standardize what's a warning and what's an error.  Let's not
770      # move stuff up and down the severity tree.  -- rjbs, 2013-04-12
771      # Convert errors in Pod::Simple that are warnings in Pod::Checker
772      # XXX Do differently so the $complaint can be reworded without this breaking
773      $severity = 'WARNING' if
774          $complaint =~ /^Expected '=item .+?'$/ ||
775          $complaint =~ /^You can't have =items \(as at line .+?\) unless the first thing after the =over is an =item$/ ||
776          $complaint =~ /^You have '=item .+?' instead of the expected '=item .+?'$/;
777    }
778
779    # rt.cpan.org #98326 - errors about Z<> ("non-empty")
780    $severity = 'WARNING' if $complaint =~ /\bZ\<\>/;
781
782    $self->poderror({ -line => $line,
783                      -severity => $severity,
784                      -msg => $complaint });
785
786    return 1; # assume everything is peachy keen
787}
788
789sub scream {
790    my ($self, $line, $complaint) = @_;
791
792    $self->poderror({ -line => $line,
793                      -severity => 'ERROR', # consider making severity 'FATAL'
794                      -msg => $complaint });
795
796    return 1;
797}
798
799
800##################################
801
802# Some helper subroutines
803
804sub _init_event { # assignments done at the start of most events
805    $_[0]{'_thispara'} = '';
806    $_[0]{'_line'} = $_[1]{'start_line'};
807    $_[0]{'_cmds_since_head'}++;
808}
809
810sub _check_fcode {
811    my ($self, $inner, $outers) = @_;
812    # Check for an fcode inside another of the same fcode
813    # XXX line number is the line of the start of the paragraph that the warning
814    # is in, not the line that the warning is on. Fix this
815
816    # Later versions of Pod::Simple forbid nested L<>'s
817    return if $inner eq 'L' && $Pod::Simple::VERSION ge '3.33';
818
819    if (grep { $_ eq $inner } @$outers) {
820        $self->poderror({ -line => $self->{'_line'},
821                          -severity => 'WARNING',
822                          -msg => "nested commands $inner<...$inner<...>...>"});
823    }
824}
825
826##################################
827
828sub handle_text { $_[0]{'_thispara'} .= $_[1] }
829
830# whiteline is a seemingly blank line that matches /[^\S\r\n]/
831sub handle_whiteline {
832    my ($line, $line_n, $self) = @_;
833    $self->poderror({
834        -line => $line_n,
835        -severity => 'WARNING',
836        -msg => 'line containing nothing but whitespace in paragraph'});
837}
838
839######## Directives
840sub handle_pod_and_cut {
841    my ($line, $line_n, $self) = @_;
842    $self->{'_cmds_since_head'}++;
843    if ($line =~ /=(pod|cut)\s+\S/) {
844        $self->poderror({ -line => $line_n,
845                          -severity => 'ERROR',
846                          -msg => "Spurious text after =$1"});
847    }
848}
849
850sub start_Para { shift->_init_event(@_); }
851sub end_Para   {
852    my $self = shift;
853    # Get the NAME of the pod document
854    if ($self->{'_head_num'} == 1 && $self->{'_head_text'} eq 'NAME') {
855        if ($self->{'_thispara'} =~ /^\s*(\S+?)\s*[,-]/) {
856            $self->{'_pod_name'} = $1 unless defined $self->{'_pod_name'};
857        }
858    }
859}
860
861sub start_Verbatim {
862    my $self = shift;
863    $self->_init_event(@_);
864
865    if ($self->{'_head_num'} == 1 && $self->{'_head_text'} eq 'NAME') {
866        $self->poderror({ -line => $self->{'_line'},
867                          -severity => 'WARNING',
868                          -msg => 'Verbatim paragraph in NAME section' });
869    }
870}
871# Don't need an end_Verbatim
872
873# Do I need to do anything else with this?
874sub start_Data { shift->_init_event() }
875
876sub start_head1 { shift->start_head(1, @_) }
877sub start_head2 { shift->start_head(2, @_) }
878sub start_head3 { shift->start_head(3, @_) }
879sub start_head4 { shift->start_head(4, @_) }
880sub start_head  {
881    my $self = shift;
882    my $h = shift;
883    $self->_init_event(@_);
884    my $prev_h = $self->{'_head_num'};
885    $self->{'_head_num'} = $h;
886    $self->{"_count_head$h"}++;
887
888    if ($h > 1 && !$self->{'_count_head'.($h-1)}) {
889        $self->poderror({ -line => $self->{'_line'},
890                          -severity => 'WARNING',
891                          -msg => "=head$h without preceding higher level"});
892    }
893
894    # If this is the first =head of the doc, $prev_h is 0, thus less than $h
895    if ($self->{'_cmds_since_head'} == 1 && $prev_h >= $h) {
896        $self->poderror({ -line => $self->{'_line'},
897                          -severity => 'WARNING',
898                          -msg => 'empty section in previous paragraph'});
899    }
900}
901
902sub end_head1 { shift->end_head(@_) }
903sub end_head2 { shift->end_head(@_) }
904sub end_head3 { shift->end_head(@_) }
905sub end_head4 { shift->end_head(@_) }
906sub end_head  {
907    my $self = shift;
908    my $arg = $self->{'_thispara'};
909    $arg =~ s/\s+$//;
910    $self->{'_head_text'} = $arg;
911    $self->{'_cmds_since_head'} = 0;
912    my $h = $self->{'_head_num'};
913    $self->node($arg); # remember this node
914    if ($arg eq '') {
915        $self->poderror({ -line => $self->{'_line'},
916                          -severity => 'ERROR',
917                          -msg => "empty =head$h" });
918    }
919}
920
921sub start_over_bullet { shift->start_over(@_, 'bullet') }
922sub start_over_number { shift->start_over(@_, 'number') }
923sub start_over_text   { shift->start_over(@_, 'definition') }
924sub start_over_block  { shift->start_over(@_, 'block') }
925sub start_over_empty  {
926    my $self = shift;
927    $self->start_over(@_, 'empty');
928    $self->poderror({ -line => $self->{'_line'},
929                      -severity => 'WARNING',
930                      -msg => 'empty =over/=back block' });
931}
932sub start_over {
933    my $self = shift;
934    my $type = pop;
935    $self->_init_event(@_);
936}
937
938sub start_item_bullet { shift->_init_event(@_) }
939sub start_item_number { shift->_init_event(@_) }
940sub start_item_text   { shift->_init_event(@_) }
941sub end_item_bullet { shift->end_item('bullet') }
942sub end_item_number { shift->end_item('number') }
943sub end_item_text   { shift->end_item('definition') }
944sub end_item {
945    my $self = shift;
946    my $type = shift;
947    # If there is verbatim text in this item, it will show up as part of
948    # 'paras', and not part of '_thispara'.  If the first para after this is a
949    # verbatim one, it actually will be (part of) the contents for this item.
950    if (   $self->{'_thispara'} eq ''
951        && (  ! @{$self->{'paras'}}
952            ||    $self->{'paras'}[0][0] !~ /Verbatim/i))
953    {
954        $self->poderror({ -line => $self->{'_line'},
955                          -severity => 'WARNING',
956                          -msg => '=item has no contents' });
957    }
958
959    $self->node($self->{'_thispara'}); # remember this node
960}
961
962sub start_for { # =for and =begin directives
963    my ($self, $flags) = @_;
964    $self->_init_event($flags);
965    push @{$self->{'_begin_stack'}}, [$self->{'_line'}, $flags->{'target'}];
966}
967
968sub end_for {
969    my ($self, $flags) = @_;
970    my ($line, $target) = @{pop @{$self->{'_begin_stack'}}};
971    if ($flags->{'fake-closer'}) { # meaning Pod::Simple generated this =end
972        $self->poderror({ -line => $line,
973                          -severity => 'ERROR',
974                          -msg => "=begin $target without matching =end $target"
975                        });
976    }
977}
978
979sub end_Document {
980    # Some final error checks
981    my $self = shift;
982
983    # no POD found here
984    $self->num_errors(-1) && return unless $self->content_seen;
985
986    my %nodes;
987    for ($self->node()) {
988        $nodes{$_} = 1;
989        if(/^(\S+)\s+\S/) {
990            # we have more than one word. Use the first as a node, too.
991            # This is used heavily in perlfunc.pod
992            $nodes{$1} ||= 2; # derived node
993        }
994    }
995    for ($self->idx()) {
996        $nodes{$_} = 3; # index node
997    }
998
999    # XXX update unresolved internal link POD -- single word not enclosed in ""?
1000    # I don't know what I was thinking when I made the above TODO, and I don't
1001    # know what it means...
1002
1003    for my $link (@{ $self->{'_internal_links'} }) {
1004        my ($name, $line) = @$link;
1005        unless ( $nodes{$name} ) {
1006            $self->poderror({ -line => $line,
1007                              -severity => 'ERROR',
1008                              -msg => "unresolved internal link '$name'"});
1009        }
1010    }
1011
1012    # check the internal nodes for uniqueness. This pertains to
1013    # =headX, =item and X<...>
1014    if ($self->{'-warnings'} > 1 ) {
1015        for my $node (sort keys %{ $self->{'_unique_nodes'} }) {
1016            my $count = $self->{'_unique_nodes'}{$node};
1017            if ($count > 1) { # not unique
1018                $self->poderror({
1019                    -line => '-',
1020                    -severity => 'WARNING',
1021                    -msg => "multiple occurrences ($count) of link target ".
1022                        "'$node'"});
1023            }
1024        }
1025    }
1026}
1027
1028########  Formatting codes
1029
1030sub start_B { shift->start_fcode('B') }
1031sub start_C { shift->start_fcode('C') }
1032sub start_F { shift->start_fcode('F') }
1033sub start_I { shift->start_fcode('I') }
1034sub start_S { shift->start_fcode('S') }
1035sub start_fcode {
1036    my ($self, $fcode) = @_;
1037    unshift @{$self->{'_fcode_stack'}}, $fcode;
1038}
1039
1040sub end_B { shift->end_fcode() }
1041sub end_C { shift->end_fcode() }
1042sub end_F { shift->end_fcode() }
1043sub end_I { shift->end_fcode() }
1044sub end_S { shift->end_fcode() }
1045sub end_fcode {
1046    my $self = shift;
1047    $self->_check_fcode(shift @{$self->{'_fcode_stack'}}, # current fcode removed
1048                        $self->{'_fcode_stack'}); # previous fcodes
1049}
1050
1051sub start_L {
1052    my ($self, $flags) = @_;
1053    $self->start_fcode('L');
1054
1055    my $link = Pod::Checker::Hyperlink->new($flags, $self);
1056    if ($link) {
1057        if (   $link->type eq 'pod'
1058            && $link->node
1059                # It's an internal-to-this-page link if no page is given, or
1060                # if the given one is to our NAME.
1061            && (! $link->page || (   $self->{'_pod_name'}
1062                                  && $link->page eq $self->{'_pod_name'})))
1063        {
1064            push @{ $self->{'_internal_links'} }, [ $link->{'-raw_node'}, $link->line ];
1065        }
1066        else {
1067            $self->hyperlink($link);
1068        }
1069    }
1070}
1071
1072sub end_L {
1073    my $self = shift;
1074    $self->end_fcode();
1075}
1076
1077sub start_X {
1078    my $self = shift;
1079    $self->start_fcode('X');
1080    # keep track of where X<> starts in the paragraph
1081    # (this is a stack so nested X<>s are handled correctly)
1082    push @{$self->{'_fcode_pos'}}, length $self->{'_thispara'};
1083}
1084sub end_X {
1085    my $self = shift;
1086    # extract contents of X<> and replace with ''
1087    my $start = pop @{$self->{'_fcode_pos'}}; # start at the beginning of X<>
1088    my $end = length($self->{'_thispara'}) - $start; # end at end of X<>
1089    my $x = substr($self->{'_thispara'}, $start, $end, '');
1090    if ($x eq "") {
1091        $self->poderror({ -line => $self->{'_line'},
1092                          -severity => 'ERROR',
1093                          -msg => "An empty X<>" });
1094    }
1095    $self->idx($x); # remember this node
1096    $self->end_fcode();
1097}
1098
1099package Pod::Checker::Hyperlink;
1100
1101# This class is used to represent L<> link structures, so that the individual
1102# elements are easily accessible.  It is based on code in Pod::Hyperlink
1103
1104sub new {
1105    my ($class,
1106        $simple_link,   # The link structure returned by Pod::Simple
1107        $caller         # The caller class
1108    ) = @_;
1109
1110    my $self = +{};
1111    bless $self, $class;
1112
1113    $self->{'-line'} ||= $caller->{'_line'};
1114    $self->{'-type'} ||= $simple_link->{'type'};
1115
1116    # Force stringification of page and node.  (This expands any E<>.)
1117    $self->{'-page'} = exists $simple_link->{'to'} ? "$simple_link->{'to'}" : "";
1118    $self->{'-node'} = exists $simple_link->{'section'} ? "$simple_link->{'section'}" : "";
1119
1120    # Save the unmodified node text, as the .t files are expecting the message
1121    # for internal link failures to include it (hence this preserves backward
1122    # compatibility).
1123    $self->{'-raw_node'} = $self->{'-node'};
1124
1125    # Remove leading/trailing white space.  Pod::Simple already warns about
1126    # these, so if the only error is this, and the link is otherwise correct,
1127    # only the Pod::Simple warning will be output, avoiding unnecessary
1128    # confusion.
1129    $self->{'-page'} =~ s/ ^ \s+ //x;
1130    $self->{'-page'} =~ s/ \s+ $ //x;
1131
1132    $self->{'-node'} =~ s/ ^ \s+ //x;
1133    $self->{'-node'} =~ s/ \s+ $ //x;
1134
1135    # Pod::Simple warns about L<> and L< >, but not L</>
1136    if ($self->{'-page'} eq "" && $self->{'-node'} eq "") {
1137        $caller->poderror({ -line => $caller->{'_line'},
1138                          -severity => 'WARNING',
1139                          -msg => 'empty link'});
1140        return;
1141    }
1142
1143    return $self;
1144}
1145
1146=item line()
1147
1148Returns the approximate line number in which the link was encountered
1149
1150=cut
1151
1152sub line {
1153    return $_[0]->{-line};
1154}
1155
1156=item type()
1157
1158Returns the type of the link; one of:
1159C<"url"> for things like
1160C<http://www.foo>, C<"man"> for man pages, or C<"pod">.
1161
1162=cut
1163
1164sub type {
1165    return  $_[0]->{-type};
1166}
1167
1168=item page()
1169
1170Returns the linked-to page or url.
1171
1172=cut
1173
1174sub page {
1175    return $_[0]->{-page};
1176}
1177
1178=item node()
1179
1180Returns the anchor or node within the linked-to page, or an empty string
1181(C<"">) if none appears in the link.
1182
1183=back
1184
1185=cut
1186
1187sub node {
1188    return $_[0]->{-node};
1189}
1190
1191=head1 AUTHOR
1192
1193Please report bugs using L<http://rt.cpan.org>.
1194
1195Brad Appleton E<lt>bradapp@enteract.comE<gt> (initial version),
1196Marek Rouchal E<lt>marekr@cpan.orgE<gt>,
1197Marc Green E<lt>marcgreen@cpan.orgE<gt> (port to Pod::Simple)
1198Ricardo Signes E<lt>rjbs@cpan.orgE<gt> (more porting to Pod::Simple)
1199Karl Williamson E<lt>khw@cpan.orgE<gt> (more porting to Pod::Simple)
1200
1201Based on code for B<Pod::Text::pod2text()> written by
1202Tom Christiansen E<lt>tchrist@mox.perl.comE<gt>
1203
1204=cut
1205
12061
1207