=head1 NAME

Test::Harness::Beyond - Beyond make test

=head1 Beyond make test

Test::Harness is responsible for running test scripts, analysing
their output and reporting success or failure. When I type
F<make test> (or F<./Build test>) for a module, Test::Harness is usually
used to run the tests (not all modules use Test::Harness but the
majority do).

To start exploring some of the features of Test::Harness I need to
switch from F<make test> to the F<prove> command (which ships with
Test::Harness). For the following examples I'll also need a recent
version of Test::Harness installed; 3.14 is current as I write.

For the examples I'm going to assume that we're working with a
'normal' Perl module distribution. Specifically I'll assume that
typing F<make> or F<./Build> causes the built, ready-to-install module
code to be available below ./blib/lib and ./blib/arch and that
there's a directory called 't' that contains our tests. Test::Harness
isn't hardwired to that configuration but it saves me from explaining
which files live where for each example.

Back to F<prove>; like F<make test> it runs a test suite - but it
provides far more control over which tests are executed, in what
order and how their results are reported. Typically F<make test>
runs all the test scripts below the 't' directory. To do the same
thing with prove I type:

  prove -rb t

The switches here are -r to recurse into any directories below 't'
and -b which adds ./blib/lib and ./blib/arch to Perl's include path
so that the tests can find the code they will be testing. If I'm
testing a module of which an earlier version is already installed
I need to be careful about the include path to make sure my tests
run against the new version I'm working on rather than the installed
one.

Unlike F<make test>, typing F<prove> doesn't automatically rebuild
my module.
If I forget to run F<make> before F<prove> I will be testing against
an older build of my module - which inevitably leads to confusion.
I either get into the habit of typing

  make && prove -rb t

or - if I have no XS code that needs to be built - I use the modules
below F<lib> instead:

  prove -Ilib -r t

So far I've shown you nothing that F<make test> doesn't do. Let's
fix that.

=head2 Saved State

If I have failing tests in a test suite that consists of more than
a handful of scripts and takes more than a few seconds to run it
rapidly becomes tedious to run the whole test suite repeatedly as
I track down the problems.

I can tell prove just to run the tests that are failing, like this:

  prove -b t/this_fails.t t/so_does_this.t

That speeds things up but I have to make a note of which tests are
failing and make sure that I run those tests. Instead I can use
prove's --state switch and have it keep track of failing tests for
me. First I do a complete run of the test suite and tell prove to
save the results:

  prove -rb --state=save t

That stores a machine readable summary of the test run in a file
called '.prove' in the current directory. If I have failures I can
then run just the failing scripts like this:

  prove -b --state=failed

I can also tell prove to save the results again so that it updates
its idea of which tests failed:

  prove -b --state=failed,save

As soon as one of my failing tests passes it will be removed from
the list of failed tests. Eventually I fix them all and prove can
find no failing tests to run:

  Files=0, Tests=0, 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  Result: NOTESTS

As I work on a particular part of my module it's most likely that
the tests that cover that code will fail. I'd like to run the whole
test suite but have it prioritize these 'hot' tests.
I can tell
prove to do this:

  prove -rb --state=hot,save t

All the tests will run but those that failed most recently will be
run first. If no tests have failed since I started saving state all
tests will run in their normal order. This combines full test
coverage with early notification of failures.

The --state switch supports a number of options; for example, to run
failed tests first, followed by all remaining tests ordered by the
timestamps of the test scripts - and save the results - I can use:

  prove -rb --state=failed,new,save t

See the prove documentation (type prove --man) for the full list
of state options.

When I tell prove to save state it writes a file called '.prove'
('_prove' on Windows) in the current directory. It's a YAML document
so it's quite easy to write tools of your own that work on the saved
test state - but the format isn't officially documented so it might
change without (much) warning in the future.

=head2 Parallel Testing

If my tests take too long to run I may be able to speed them up by
running multiple test scripts in parallel. This is particularly
effective if the tests are I/O bound or if I have multiple CPU
cores. I tell prove to run my tests in parallel like this:

  prove -rb -j 9 t

The -j switch enables parallel testing; the number that follows it
is the maximum number of tests to run in parallel. Sometimes tests
that pass when run sequentially will fail when run in parallel. For
example, if two different test scripts use the same temporary file
or attempt to listen on the same socket I'll have problems running
them in parallel. If I see unexpected failures I need to check my
tests to work out which of them are trampling on the same resource
and rename temporary files or add locks as appropriate.
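One easy way to stop test scripts trampling on the same temporary
file is to let the core File::Temp module hand each process a unique
name. Here's a minimal sketch - the file contents and the hard-coded
path mentioned in the comment are purely illustrative:

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# A hard-coded path like /tmp/mytest.dat collides when two test
# scripts run at once; tempfile() instead returns a unique,
# already-open handle, and UNLINK => 1 removes the file when the
# process exits.
my ( $fh, $filename ) = tempfile( UNLINK => 1 );

print {$fh} "scratch data for this test only\n";
close $fh or die "Can't close $filename: $!";
```

Because every test process gets its own $filename, the scripts can
safely run under -j without clobbering each other's scratch data.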
To get the most performance benefit I want to have the test scripts
that take the longest to run start first - otherwise I'll be waiting
for the one test that takes nearly a minute to complete after all
the others are done. I can use the --state switch to run the tests
in slowest to fastest order:

  prove -rb -j 9 --state=slow,save t

=head2 Non-Perl Tests

The Test Anything Protocol (http://testanything.org/) isn't just
for Perl. Just about any language can be used to write tests that
output TAP. There are TAP based testing libraries for C, C++, PHP,
Python and many others. If I can't find a TAP library for my language
of choice it's easy to generate valid TAP. It looks like this:

  1..3
  ok 1 - init OK
  ok 2 - opened file
  not ok 3 - appended to file

The first line is the plan - it specifies the number of tests I'm
going to run so that it's easy to check that the test script didn't
exit before running all the expected tests. The following lines are
the test results - 'ok' for pass, 'not ok' for fail. Each test has
a number and, optionally, a description. And that's it. Any language
that can produce output like that on STDOUT can be used to write
tests.

Recently I've been rekindling a two-decades-old interest in Forth.
Evidently I have a masochistic streak that even Perl can't satisfy.
I want to write tests in Forth and run them using prove (you can
find my gforth TAP experiments at
https://svn.hexten.net/andy/Forth/Testing/). I can use the --exec
switch to tell prove to run the tests using gforth, like this:

  prove -r --exec gforth t

Alternately, if the language used to write my tests allows a shebang
line I can use that to specify the interpreter.
Here's a test written
in PHP:

  #!/usr/bin/php
  <?php
  print "1..2\n";
  print "ok 1\n";
  print "not ok 2\n";
  ?>

If I save that as t/phptest.t the shebang line will ensure that it
runs correctly along with all my other tests.

=head2 Mixing it up

Subtle interdependencies between test programs can mask problems -
for example an earlier test may neglect to remove a temporary file
that affects the behaviour of a later test. To find this kind of
problem I use the --shuffle and --reverse options to run my tests
in random or reversed order.

=head2 Rolling My Own

If I need a feature that prove doesn't provide I can easily write my own.

Typically I'll want to change how TAP gets I<input> into and I<output>
from the parser. L<App::Prove> supports arbitrary plugins, and L<TAP::Harness>
supports custom I<formatters> and I<source handlers> that I can load using
either L<prove> or L<Module::Build>; there are many examples to base mine on.
For more details see L<App::Prove>, L<TAP::Parser::SourceHandler>, and
L<TAP::Formatter::Base>.

If writing a plugin is not enough I can write my own test harness; one of
the motives for the 3.00 rewrite of Test::Harness was to make it easier to
subclass and extend.

The Test::Harness module is a compatibility wrapper around TAP::Harness.
For new applications I should use TAP::Harness directly. As we'll
see, prove uses TAP::Harness.

When I run prove it processes its arguments, figures out which test
scripts to run and then passes control to TAP::Harness to run the
tests, parse, analyse and present the results. By subclassing
TAP::Harness I can customise many aspects of the test run.

I want to log my test results in a database so I can track them
over time. To do this I override the summary method in TAP::Harness.
I start with a simple prototype that dumps the results as a YAML
document:

  package My::TAP::Harness;

  use base 'TAP::Harness';
  use YAML;

  sub summary {
      my ( $self, $aggregate ) = @_;
      print Dump( $aggregate );
      $self->SUPER::summary( $aggregate );
  }

  1;

I need to tell prove to use My::TAP::Harness. If My::TAP::Harness
is on Perl's @INC include path I can:

  prove --harness=My::TAP::Harness -rb t

If I don't have My::TAP::Harness installed on @INC I need to provide
the correct path to perl when I run prove:

  perl -Ilib `which prove` --harness=My::TAP::Harness -rb t

I can incorporate these options into my own version of prove. It's
pretty simple. Most of the work of prove is handled by App::Prove.
The important code in prove is just:

  use App::Prove;

  my $app = App::Prove->new;
  $app->process_args(@ARGV);
  exit( $app->run ? 0 : 1 );

If I write a subclass of App::Prove I can customise any aspect of
the test runner while inheriting all of prove's behaviour. Here's
myprove:

  #!/usr/bin/env perl

  use lib qw( lib );    # Add ./lib to @INC
  use App::Prove;

  my $app = App::Prove->new;

  # Use custom TAP::Harness subclass
  $app->harness( 'My::TAP::Harness' );

  $app->process_args( @ARGV );
  exit( $app->run ? 0 : 1 );

Now I can run my tests like this:

  ./myprove -rb t

=head2 Deeper Customisation

Now that I know how to subclass and replace TAP::Harness I can
replace any other part of the harness. To do that I need to know
which classes are responsible for which functionality. Here's a
brief guided tour; the default class for each component is shown
in parentheses. Normally any replacements I write will be subclasses
of these default classes.
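As a concrete sketch, here's what such a subclass might look like
for the aggregator. The logging in add() is purely illustrative,
and to keep the example self-contained I feed the parser a TAP
string directly rather than running a real test script:

```perl
package My::Aggregator;

use strict;
use warnings;
use base 'TAP::Parser::Aggregator';

# Illustrative override: note each test script as its results are
# added, then defer to the default aggregation behaviour.
sub add {
    my ( $self, $description, $parser ) = @_;
    warn "aggregated: $description\n";
    return $self->SUPER::add( $description, $parser );
}

package main;

use TAP::Parser;

# Parse a TAP document from a string instead of running a script
my $parser = TAP::Parser->new( { tap => "1..2\nok 1\nok 2\n" } );
$parser->run;

my $aggregate = My::Aggregator->new;
$aggregate->add( 't/demo.t', $parser );

printf "%d tests, %d failed\n", $aggregate->total,
    scalar $aggregate->failed;
```

Wiring My::Aggregator into a real run is then just a matter of the
aggregator_class call or constructor argument shown below.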
When I run my tests TAP::Harness creates a scheduler
(TAP::Parser::Scheduler) to work out the running order for the
tests, an aggregator (TAP::Parser::Aggregator) to collect and analyse
the test results and a formatter (TAP::Formatter::Console) to display
those results.

If I'm running my tests in parallel there may also be a multiplexer
(TAP::Parser::Multiplexer) - the component that allows multiple
tests to run simultaneously.

Once it has created those helpers TAP::Harness starts running the
tests. For each test it creates a new parser (TAP::Parser) which
is responsible for running the test script and parsing its output.

To replace any of these components I call one of these harness
methods with the name of the replacement class:

  aggregator_class
  formatter_class
  multiplexer_class
  parser_class
  scheduler_class

For example, to replace the aggregator I would:

  $harness->aggregator_class( 'My::Aggregator' );

Alternately I can supply the names of my substitute classes to the
TAP::Harness constructor:

  my $harness = TAP::Harness->new(
      { aggregator_class => 'My::Aggregator' }
  );

If I need to reach even deeper into the internals of the harness I
can replace the classes that TAP::Parser uses to execute test scripts
and tokenise their output. Before running a test script TAP::Parser
creates a grammar (TAP::Parser::Grammar) to decode the raw TAP into
tokens, a result factory (TAP::Parser::ResultFactory) to turn the
decoded TAP results into objects and, depending on whether it's
running a test script or reading TAP from a file, scalar or array,
a source or an iterator (TAP::Parser::IteratorFactory).
Each of these objects may be replaced by calling one of these parser
methods:

  source_class
  perl_source_class
  grammar_class
  iterator_factory_class
  result_factory_class

=head2 Callbacks

As an alternative to subclassing the components I need to change I
can attach callbacks to the default classes. TAP::Harness exposes
these callbacks:

  parser_args       Tweak the parameters used to create the parser
  made_parser       Just made a new parser
  before_runtests   About to run tests
  after_runtests    Have run all tests
  after_test        Have run an individual test script

TAP::Parser also supports callbacks; bailout, comment, plan, test,
unknown, version and yaml are called for the corresponding TAP
result types, ALL is called for all results, ELSE is called for all
results for which a named callback is not installed and EOF is
called once at the end of each TAP stream.

To install a callback I pass the name of the callback and a subroutine
reference to TAP::Harness or TAP::Parser's callback method:

  $harness->callback( after_test => sub {
      my ( $script, $desc, $parser ) = @_;
  } );

I can also pass callbacks to the constructor:

  my $harness = TAP::Harness->new({
      callbacks => {
          after_test => sub {
              my ( $script, $desc, $parser ) = @_;
              # Do something interesting here
          }
      }
  });

When it comes to altering the behaviour of the test harness there's
more than one way to do it. Which way is best depends on my
requirements. In general, if I only want to observe test execution
without changing the harness's behaviour (for example to log test
results to a database) I choose callbacks. If I want to make the
harness behave differently subclassing gives me more control.

=head2 Parsing TAP

Perhaps I don't need a complete test harness.
If I already have a
TAP test log that I need to parse, all I need is TAP::Parser and the
various classes it depends upon. Here's the code I need to run a
test and parse its TAP output:

  use TAP::Parser;

  my $parser = TAP::Parser->new( { source => 't/simple.t' } );
  while ( my $result = $parser->next ) {
      print $result->as_string, "\n";
  }

Alternately I can pass an open filehandle as source and have the
parser read from that rather than attempting to run a test script:

  open my $tap, '<', 'tests.tap'
      or die "Can't read TAP transcript ($!)\n";
  my $parser = TAP::Parser->new( { source => $tap } );
  while ( my $result = $parser->next ) {
      print $result->as_string, "\n";
  }

This approach is useful if I need to convert my TAP based test
results into some other representation. See TAP::Convert::TET
(http://search.cpan.org/dist/TAP-Convert-TET/) for an example of
this approach.

=head2 Getting Support

The Test::Harness developers hang out on the tapx-dev mailing
list[1]. For discussion of general, language independent TAP issues
there's the tap-l[2] list. Finally there's a wiki dedicated to the
Test Anything Protocol[3]. Contributions to the wiki, patches and
suggestions are all welcome.

[1] L<http://www.hexten.net/mailman/listinfo/tapx-dev>

[2] L<http://testanything.org/mailman/listinfo/tap-l>

[3] L<http://testanything.org/>