| Name | Date | Size | #Lines | LOC |
|---|---|---|---|---|
| .. | 03-May-2022 | - | | |
| bin/ | 04-May-2020 | - | 52 | 47 |
| inc/IO/ | 04-May-2020 | - | 60 | 31 |
| lib/WWW/Mechanize/ | 04-May-2020 | - | 2,106 | 982 |
| t/ | 03-May-2022 | - | 2,822 | 2,242 |
| xt/ | 04-May-2020 | - | 693 | 507 |
| .gitignore | 04-May-2020 | 161 | 17 | 16 |
| Changes | 04-May-2020 | 17.4 KiB | 417 | 353 |
| LICENSE | 04-May-2020 | 8.7 KiB | 201 | 151 |
| MANIFEST | 04-May-2020 | 1.3 KiB | 65 | 64 |
| MANIFEST.SKIP | 04-May-2020 | 293 | 29 | 28 |
| META.json | 04-May-2020 | 2 KiB | 77 | 76 |
| META.yml | 04-May-2020 | 1.1 KiB | 47 | 46 |
| Makefile.PL | 04-May-2020 | 7.4 KiB | 239 | 184 |
| README | 04-May-2020 | 1.5 KiB | 63 | 35 |
| README.mkdn | 04-May-2020 | 18.8 KiB | 755 | 467 |
README

WWW::Mechanize::Shell - An interactive shell for WWW::Mechanize

DESCRIPTION

This module implements a www-like shell on top of WWW::Mechanize
and can also output crude Perl code that recreates
the recorded session. Its main use is as an interactive starting point
for automating a session through WWW::Mechanize.

Cookies are supported, but no cookies are read from your existing
browser sessions. See HTTP::Cookies on how to implement reading/writing
your current browser's cookies.


INSTALLATION

This is a Perl module distribution. It should be installed with whichever
tool you use to manage your installation of Perl, e.g. any of

  cpanm .
  cpan  .
  cpanp -i .

Consult https://www.cpan.org/modules/INSTALL.html for further instruction.
Should you wish to install this module manually, the procedure is

  perl Makefile.PL
  make
  make test
  make install

REPOSITORY

The public repository of this module is
https://github.com/Corion/WWW-Mechanize-Shell.

SUPPORT

The public support forum of this module is
http://perlmonks.org/.

SEE ALSO

WWW::Mechanize, WWW::Mechanize::FormFiller, WWW::Mechanize::Firefox

AUTHOR

Max Maischein, <corion@cpan.org>

Please contact me if you find bugs or otherwise improve the module. More tests are also very welcome!

COPYRIGHT AND LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Copyright (C) 2002-2020 Max Maischein


README.mkdn


[![Travis Build Status](https://travis-ci.org/Corion/WWW-Mechanize-Shell.svg?branch=master)](https://travis-ci.org/Corion/WWW-Mechanize-Shell)
[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/Corion/WWW-Mechanize-Shell?branch=master&svg=true)](https://ci.appveyor.com/project/Corion/WWW-Mechanize-Shell)

# NAME

WWW::Mechanize::Shell - An interactive shell for WWW::Mechanize

# SYNOPSIS

From the command line as

    perl -MWWW::Mechanize::Shell -eshell

or alternatively as a custom shell program via:

    #!/usr/bin/perl -w
    use strict;
    use WWW::Mechanize::Shell;

    my $shell = WWW::Mechanize::Shell->new("shell");

    if (@ARGV) {
      $shell->source_file( @ARGV );
    } else {
      $shell->cmdloop;
    };

# DESCRIPTION

This module implements a www-like shell on top of WWW::Mechanize
and can also output crude Perl code that recreates
the recorded session. Its main use is as an interactive starting point
for automating a session through WWW::Mechanize.

Cookies are supported, but no cookies are read from your existing
browser sessions. See [HTTP::Cookies](https://metacpan.org/pod/HTTP::Cookies) on how to implement reading/writing
your current browser's cookies.

## `WWW::Mechanize::Shell->new %ARGS`

This is the constructor for a new shell instance. Some of the options
can be passed to the constructor as parameters.

By default, a file `.mechanizerc` (respectively `mechanizerc` under Windows)
in the user's home directory is executed before the interactive shell loop is
entered. This can be used to set some defaults. If you want to supply a different
filename for the rcfile, the `rcfile` parameter can be passed to the constructor:

    rcfile => '.myapprc',

- **agent**

        my $shell = WWW::Mechanize::Shell->new(
            agent => WWW::Mechanize::Chrome->new(),
        );

    Pass in a premade custom user agent. This object must be compatible with
    [WWW::Mechanize](https://metacpan.org/pod/WWW::Mechanize). Use this feature from the command line as

        perl -Ilib -MWWW::Mechanize::Chrome \
                   -MWWW::Mechanize::Shell \
                   -e"shell(agent => WWW::Mechanize::Chrome->new())"

## `$shell->release_agent`

Since the shell stores a reference back to itself within the
WWW::Mechanize instance, it is necessary to break this
circular reference. This method does this.

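To illustrate why the circular reference matters, here is a minimal sketch using stand-in classes. `Demo::Shell`, `Demo::Agent` and the `shell` hash key are hypothetical names chosen for this example; they are not the module's internals, and the real `release_agent` may do more than this.

```perl
use strict;
use warnings;

our %destroyed;    # counts DESTROY calls per class, for demonstration only

# Hypothetical stand-in for the WWW::Mechanize instance.
package Demo::Agent {
    sub new     { return bless {}, shift }
    sub DESTROY { $main::destroyed{agent}++ }
}

# Hypothetical stand-in for the shell: it owns an agent, and the agent
# holds a reference back to the shell, which forms a cycle.
package Demo::Shell {
    sub new {
        my ($class) = @_;
        my $self = bless { agent => Demo::Agent->new }, $class;
        $self->{agent}{shell} = $self;    # the back-reference
        return $self;
    }

    # Break the cycle so that both objects can be freed.
    sub release_agent {
        my ($self) = @_;
        delete $self->{agent}{shell};
        return;
    }

    sub DESTROY { $main::destroyed{shell}++ }
}

package main;

{
    my $leaky = Demo::Shell->new;
}   # cycle still intact: neither object is destroyed here

{
    my $clean = Demo::Shell->new;
    $clean->release_agent;
}   # cycle broken: shell and agent are both destroyed here

printf "destroyed shells: %d, agents: %d\n",
    $destroyed{shell} // 0, $destroyed{agent} // 0;
```

With Perl's reference counting, only the pair whose back-reference was removed is freed when it goes out of scope; the other pair lingers until the program exits.
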
## `$shell->source_file FILENAME`

The `source_file` method executes the lines of FILENAME
as if they were typed in.

    $shell->source_file( $filename );

## `$shell->display_user_warning`

All user warnings are routed through this routine
so they can be rerouted / disabled easily.

## `$shell->print_paged LIST`

Prints the text in LIST using `$ENV{PAGER}`. If `$ENV{PAGER}`
is empty, prints directly to `STDOUT`. Most of this routine
comes from the `perldoc` utility.

## `$shell->link_text LINK`

Returns a meaningful text from a WWW::Mechanize::Link object. This is (in order of
precedence):

    $link->text
    $link->name
    $link->url

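The order of precedence can be sketched as follows. `Demo::Link` is a hypothetical stand-in for WWW::Mechanize::Link, and treating empty strings like missing values is an assumption of this sketch, not a statement about the module's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for WWW::Mechanize::Link, which offers
# text(), name() and url() accessors.
package Demo::Link {
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub text { return $_[0]{text} }
    sub name { return $_[0]{name} }
    sub url  { return $_[0]{url} }
}

package main;

# Return the first non-empty value in the documented order of precedence.
sub link_text_sketch {
    my ($link) = @_;
    for my $candidate ($link->text, $link->name, $link->url) {
        return $candidate if defined $candidate and length $candidate;
    }
    return '';
}

my $named   = Demo::Link->new( text => 'Home', url => 'http://example.com/' );
my $unnamed = Demo::Link->new( url => 'http://example.com/img.png' );

print link_text_sketch($named),   "\n";    # the link text wins
print link_text_sketch($unnamed), "\n";    # falls back to the URL
```
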
## `$shell->history`

Returns the (relevant) shell history, that is, all commands
that were not solely for the information of the user. The
lines are returned as a list.

    print join "\n", $shell->history;

## `$shell->script`

Returns the shell history as a Perl program. The
lines are returned as a list. The lines do not have
a one-to-one correspondence to the lines in the history.

    print join "\n", $shell->script;

## `$shell->status`

`status` is called for status updates.

## `$shell->display FILENAME LINES`

`display` is called to output listings, currently from the
`history` and `script` commands. If the second parameter
is defined, it is the name of the file to be written,
otherwise the lines are displayed to the user.

# COMMANDS

The shell implements various commands:

## exit

Leaves the shell.

## restart

Restart the shell.

This is mostly useful when you are modifying the shell itself. It doesn't
work if you use the shell in oneliner mode with `-e`.

## get

Download a specific URL.

This is used as the entry point in all sessions.

Syntax:

    get URL

## save

Download a link into a file.

If more than one link matches the RE, all matching links are
saved. The filename is taken from the last part of the
URL. Alternatively, the number of a link may also be given.

Syntax:

    save RE

## content

Display the content for the current page.

Syntax:

    content [FILENAME]

If the FILENAME argument is provided, save the content to the file.

A trailing "\\n" is added to the end of the content when using the
shell, so this might not be ideally suited to save binary files without
manual editing of the produced script.

## title

Display the current page title as found
in the `<TITLE>` tag.

## headers

Prints all `<H1>` through `<H5>` strings found in the content,
indented accordingly. With an argument, prints only those
levels; e.g., `headers 145` prints H1, H4 and H5 strings only.

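The level filter can be pictured with a small sketch. This is not the shell's implementation: `headers_sketch` is a hypothetical helper, it uses a simple regular expression instead of a real HTML parser, and the output format is invented for the example.

```perl
use strict;
use warnings;

# Sketch: list headings from an HTML string, optionally restricted to the
# levels named in $levels (e.g. "145" for H1, H4 and H5).
sub headers_sketch {
    my ($html, $levels) = @_;
    $levels = '12345' unless defined $levels and length $levels;

    my @lines;
    while ($html =~ m{<h([1-5])\b[^>]*>(.*?)</h\1\s*>}gis) {
        my ($level, $text) = ($1, $2);
        next if index($levels, $level) < 0;    # level not requested
        $text =~ s/<[^>]+>//g;                 # drop nested markup
        $text =~ s/^\s+|\s+$//g;               # trim whitespace
        push @lines, ('  ' x ($level - 1)) . "h$level: $text";
    }
    return @lines;
}

my $html = '<h1>Title</h1><h2>Intro</h2><h4>Detail</h4>';
print "$_\n" for headers_sketch($html, '145');    # skips the H2 heading
```
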
## ua

Get/set the current user agent.

Syntax:

    # fake Internet Explorer
    ua "Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)"

    # fake QuickTime v5
    ua "QuickTime (qtver=5.0.2;os=Windows NT 5.0Service Pack 2)"

    # fake Mozilla/Gecko based
    ua "Mozilla/5.001 (windows; U; NT4.0; en-us) Gecko/25250101"

    # set empty user agent:
    ua ""

## links

Display all links on a page.

The link numbers displayed can be used by `open` to directly
select a link to follow.

## images

Display images on a page.

## parse

Dump the output of HTML::TokeParser of the current content.

## forms

Display all forms on the current page.

## form

Select the form named NAME.

If NAME matches `/^\d+$/`, it is assumed to be the (1-based) index
of the form to select. There is no way of selecting a numerically
named form by its name.

## dump

Dump the values of the current form.

## value

Set a form value.

Syntax:

    value NAME [VALUE]

## tick

Set checkbox marks.

Syntax:

    tick NAME VALUE(s)

If no value is given, all boxes are checked.

## untick

Remove checkbox marks.

Syntax:

    untick NAME VALUE(s)

If no value is given, all marks are removed.

## submit

Submits the form without clicking on any button.

## click

Clicks on the button named NAME.

No regular expression expansion is done on NAME.

Syntax:

    click NAME

If you have a button that has no name (displayed as NONAME),
use

    click ""

to click on it.

## open

`open` accepts one argument, which can be a regular expression or the number
of a link on the page, starting at zero. These numbers are displayed by the
`links` function. It goes directly to the page if a number is used
or if the RE has one match. Otherwise, a list of links matching
the regular expression is displayed.

The regular expression should start and end with "/".

Syntax:

    open  [ RE | # ]

## back

Go back one page in the browser page history.

## reload

Repeat the last request, thus reloading the current page.

Note that POST requests are also blindly repeated, as this command
is mostly intended to be used when testing server side code.

## browse

Open the web browser with the current page.

Displays the current page in the browser.

## set

Set a shell option.

Syntax:

    set OPTION [value]

The command lists all valid options. Here is a short overview of
the available options:

    autosync      - automatically synchronize the browser window
    autorestart   - restart the shell when any required module changes
                    This does not work with -e oneliners.
    watchfiles    - watch all required modules for changes
    cookiefile    - the file where to store all cookies
    dumprequests  - dump all requests to STDOUT
    dumpresponses - dump the headers of the responses to STDOUT
    verbose       - print commands to STDERR as they are run,
                    when sourcing from a file

## history

Display your current session history as the relevant commands.

Syntax:

    history [FILENAME]

Commands that have no influence on the browser state are not added
to the history. If a parameter is given to the `history` command,
the history is saved to that file instead of displayed onscreen.

## script

Display your current session history as a Perl script using WWW::Mechanize.

Syntax:

    script [FILENAME]

If a parameter is given to the `script` command, the script is saved to
that file instead of displayed on the console.

This command was formerly known as `history`.

## comment

Adds a comment to the script and the history. The comment
is prepended with a \\n to increase readability.

## fillout

Fill out the current form.

Interactively asks for the values that have no preset
value via the autofill command.

## auth

Set basic authentication credentials.

Syntax:

    auth user password

If you know the authority and the realm in advance, you can
presupply the credentials, for example at the start of the script:

        >auth corion secret
        >get http://www.example.com
        Retrieving http://www.example.com(200)
        http://www.example.com>

## table

Display a table described by the columns COLUMNS.

Syntax:

    table COLUMNS

Example:

    table Product Price Description

If there is a table on the current page that has in its first row the three
columns `Product`, `Price` and `Description` (not necessarily in that order),
the script will display these columns of the whole table.

The `HTML::TableExtract` module is needed for this feature.

## tables

Display a list of tables.

Syntax:

    tables

This command will display the top row for every
table on the current page. This is convenient if you want
to find out what the exact spellings for each column are.

The command does not always work nicely. For example, if a
site uses tables for layout, it is harder to tell
which tables are relevant and which are not.

[HTML::TableExtract](https://metacpan.org/pod/HTML::TableExtract) is needed for this feature.

## cookies

Set the cookie file name.

Syntax:

    cookies FILENAME

## autofill

Define an automatic value.

Sets a form value to be filled automatically. The NAME parameter is
the WWW::Mechanize::FormFiller::Value subclass you want to use. For
session fields, `Keep` is a good candidate; for interactive input,
`Ask` is a value implemented by the shell.

A field name starting and ending with a slash (`/`) is taken to be
a regular expression and will be applied to all fields with their
name matching the expression. An entry whose field name matches exactly
still takes precedence over the regular expression.

Syntax:

    autofill NAME [PARAMETERS]

Examples:

    autofill login Fixed corion
    autofill password Ask
    autofill selection Random red green orange
    autofill session Keep
    autofill "/date$/" Random::Date string "%m/%d/%Y"

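The precedence between exact field names and regular expressions can be sketched like this. `pick_autofill` and its data layout are hypothetical, and trying the expressions in definition order is an assumption of the sketch rather than documented behaviour.

```perl
use strict;
use warnings;

# Sketch: choose the autofill rule for a field. Exact names win over
# regular expressions; expressions are tried in definition order.
sub pick_autofill {
    my ($field, $exact, $patterns) = @_;
    return $exact->{$field} if exists $exact->{$field};
    for my $rule (@$patterns) {
        my ($re, $filler) = @$rule;
        return $filler if $field =~ $re;
    }
    return;    # no rule defined for this field
}

my %exact    = ( login => 'Fixed', start_date => 'Keep' );
my @patterns = ( [ qr/date$/ => 'Random::Date' ] );

for my $field (qw(login start_date end_date comment)) {
    my $filler = pick_autofill($field, \%exact, \@patterns);
    printf "%-10s => %s\n", $field, $filler // '(none)';
}
```

Here `start_date` keeps its exact rule even though it also matches `/date$/`, while `end_date` falls through to the regular expression.
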
## eval

Evaluate Perl code and print the result.

Syntax:

    eval CODE

For the generated scripts, anything matching the regular expression
`/\$self->agent\b/` is automatically
replaced by `$agent` in your eval code, to do the Right Thing.

Examples:

    # Say hello
    eval "Hello World"

    # And take a look at the current content type
    eval $self->agent->ct

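The replacement described above amounts to a single substitution. The following sketch shows its effect on a code string; `munge_eval_sketch` is a hypothetical name and not the module's own routine.

```perl
use strict;
use warnings;

# Sketch: rewrite shell-side code for use in a generated script, where the
# agent lives in $agent rather than in $self->agent.
sub munge_eval_sketch {
    my ($code) = @_;
    $code =~ s/\$self->agent\b/\$agent/g;
    return $code;
}

print munge_eval_sketch('$self->agent->ct'), "\n";
```
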
## source

Execute a batch of commands from a file.

Syntax:

    source FILENAME

## versions

Print the version numbers of important modules.

Syntax:

    versions

## timeout

Set a new timeout value for the agent. Affects all subsequent
requests. VALUE is in seconds.

Syntax:

    timeout VALUE

## ct

Prints the content type of the most recent response.

Syntax:

    ct

## referrer

Set the value of the `Referer:` header.

Syntax:

    referer URL
    referrer URL

## referer

Alias for referrer.

## response

Display the last server response.

## `$shell->munge_code( CODE )`

Munges a coderef to become code fit for
output independent of WWW::Mechanize::Shell.

## `shell`

This subroutine is exported by default as a convenience method
so that the following oneliner invocation works:

    perl -MWWW::Mechanize::Shell -eshell

You can pass constructor arguments to this
routine as well. Any scripts given in `@ARGV`
will be run. If `@ARGV` is empty,
an interactive loop will be started.

# SAMPLE SESSIONS

## Entering values

    # Search for a term on Google
    get http://www.google.com
    value q "Corions Homepage"
    click btnG
    script
    # (yes, this is a bad example of automating, as Google
    #  already has a Perl API. But other sites don't)

## Retrieving a table

    get http://www.perlmonks.org
    open "/Saints in/"
    table User Experience Level
    script
    # now you have a program that gives you a csv file of
    # that table.

## Uploading a file

    get http://aliens:xxxxx/
    value f path/to/file
    click "upload"

## Batch download

    # download prerelease versions of my modules
    get http://www.corion.net/perl-dev
    save /.tar.gz$/

# REGULAR EXPRESSION SYNTAX

Some commands take regular expressions as parameters. A regular
expression **must** be a single parameter matching `^/.*/([isxm]+)?$`, so
you have to use quotes around it if the expression contains spaces:

    /link_foo/       # will match as (?-xims:link_foo)
    "/link foo/"     # will match as (?-xims:link foo)

Slashes do not need to be escaped, as the shell knows that a RE starts and
ends with a slash:

    /link/foo/       # will match as (?-xims:link/foo)
    "/link/ /foo/"   # will match as (?-xims:link/\s/foo)

The `/i` modifier works as expected.
If you desire more power over the regular expressions, consider dropping
to Perl or recommend me a good parser module for regular expressions.

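As a rough sketch of the parameter format, the following helper splits such a parameter into body and modifiers and compiles it. `parse_re_param` is a hypothetical name, `\z` is used in place of `$` to anchor the end, and the sketch does not reproduce the shell's own handling of whitespace.

```perl
use strict;
use warnings;

# Sketch: turn a shell parameter such as "/foo/i" into a compiled regular
# expression; returns undef if the parameter is not in /.../ form.
sub parse_re_param {
    my ($param) = @_;
    return unless $param =~ m{^/(.*)/([isxm]+)?\z};
    my ($body, $flags) = ($1, $2 // '');
    return length($flags) ? qr/(?${flags}:$body)/ : qr/$body/;
}

for my $param ('/link/foo/', '/LINK/i', 'plain') {
    my $re = parse_re_param($param);
    print "$param => ", ( defined $re ? 'regular expression' : 'plain text' ), "\n";
}
```

Because the body is matched greedily up to the last slash, inner slashes need no escaping, in line with the examples above.
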
# DISPLAYING HTML

WWW::Mechanize::Shell now uses the module HTML::Display
to display the HTML of the current page in your browser.
See the documentation of HTML::Display for how to
make it use your browser of choice in case it does not
already guess it correctly.

# FILLING FORMS VIA CUSTOM CODE

If you want to stay within the confines of the shell, but still
want to fill out forms using custom Perl code, here is a recipe
for achieving this:

Code passed to the `eval` command gets evaluated in the WWW::Mechanize::Shell
namespace. You can inject new subroutines there and these get picked
up by the Callback class of WWW::Mechanize::FormFiller:

    # Fill in the "date" field with the current date/time as string
    eval sub &::custom_today { scalar localtime };
    autofill date Callback WWW::Mechanize::Shell::custom_today
    fillout

This method can also be used to retrieve data from shell scripts:

    # Fill in the "date" field with the current date/time as string
    # works only if there is a program "date"
    # (chomp needs a variable to modify and returns a count, not the string)
    eval sub &::custom_today { chomp(my $date = `date`); $date };
    autofill date Callback WWW::Mechanize::Shell::custom_today
    fillout

As the namespace is different between the shell and the generated
script, make sure you always fully qualify your subroutine names,
either in your own namespace or in the main namespace.

# GENERATED SCRIPTS

The `script` command outputs a skeleton script that reproduces
your actions as done in the current session. It pulls in
`WWW::Mechanize::FormFiller`, which is possibly not needed. You
should add some error and connection checking afterwards.

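One possible shape for such checking is a small wrapper around the requests in the generated script. This is only a sketch: `checked_get` is a hypothetical helper, and it assumes the agent offers the `get`, `success` and `status` methods of WWW::Mechanize.

```perl
use strict;
use warnings;

# Sketch: fetch a URL and die with a readable message on failure.
# $agent is expected to behave like a WWW::Mechanize object, i.e. to
# provide get(), success() and status().
sub checked_get {
    my ($agent, $url) = @_;

    # With autocheck enabled, get() itself dies on errors, so trap that too.
    my $ok = eval { $agent->get($url); 1 };
    die "Request for $url died: $@" unless $ok;

    die sprintf( "Request for %s failed with status %s\n", $url, $agent->status )
        unless $agent->success;
    return $agent;
}

# In a generated script this could replace a bare $agent->get(...):
#
#   use WWW::Mechanize;
#   my $agent = WWW::Mechanize->new( autocheck => 0 );
#   checked_get( $agent, 'http://www.example.com/' );
```
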
# ADDING FIELDS TO HTML

If you are automating a JavaScript-dependent site, you will encounter
JavaScript like this:

    <script>
      document.write( "<input type=submit name=submit>" );
    </script>

HTML::Form will not know about this and will not have provided a
submit button for you (understandably). If you want to create such
a submit button from within your automation script, use the following
code:

    $agent->current_form->push_input( submit => { name => "submit", value =>"submit" } );

This also works for other dynamically generated input fields.

To fake an input field from within a shell session, use the `eval` command:

    eval $self->agent->current_form->push_input(submit=>{name=>"submit",value=>"submit"});

And yes, the generated script should do the Right Thing for this eval as well.

# LOCAL FILES

If you want to use the shell on a local file without setting up a `http` server
to serve the file, you can use the `file:` URI scheme to load it into the "browser":

    get file:local.html
    forms

# PROXY SUPPORT

Currently, proxy support is implemented via a call to
the `env_proxy` method of the WWW::Mechanize object, which
loads the proxies from the environment. There is no provision made
to prevent using proxies (yet). The generated scripts also
load their proxies from the environment.

# ONLINE HELP

The online help feature is currently a bit broken in `Term::Shell`,
but a fix is in the works. Until then, you can re-enable the
dynamic online help by patching `Term::Shell`:

Remove the three lines

    my $smry = exists $o->{handlers}{$h}{smry}
      ? $o->summary($h)
      : "undocumented";

in `sub run_help` and replace them by

    my $smry = $o->summary($h);

The shell works without this patch and the online help is still
available through `perldoc WWW::Mechanize::Shell`.

# BUGS

Bug reports are very welcome - please use the RT interface at
https://rt.cpan.org/NoAuth/Bugs.html?Dist=WWW-Mechanize-Shell or send a
descriptive mail to bug-WWW-Mechanize-Shell@rt.cpan.org . Please
try to include as much (relevant) information as possible - a test script
that replicates the undesired behaviour is welcome every time!

- The two parameter version of the `auth` command guesses the realm from
the last received response. Currently a RE is used to extract the realm,
but this fails with some servers or in some cases. Use the four
parameter version of `auth`, or if not possible, code the extraction
in Perl, either in the final script or through `eval` commands.
- The shell currently detects when you want to follow a JavaScript link and tells you
that this is not supported. It would be nicer if there was some callback mechanism
to (automatically?) extract URLs from JavaScript-infected links.

# TODO

- Add XPath expressions (by moving `WWW::Mechanize` from HTML::Parser to XML::LibXML
or maybe easier, by tacking Class::XPath onto an HTML tree)
- Add `head` as a command?
- Optionally silence the HTML::Parser / HTML::Form warnings about invalid HTML.

# EXPORT

The routine `shell` is exported into the importing namespace. This
is mainly for convenience so you can use the following commandline
invocation of the shell, as with CPAN:

    perl -MWWW::Mechanize::Shell -e"shell"

# REPOSITORY

The public repository of this module is
[https://github.com/Corion/WWW-Mechanize-Shell](https://github.com/Corion/WWW-Mechanize-Shell).

# SUPPORT

The public support forum of this module is
[http://perlmonks.org/](http://perlmonks.org/).

# COPYRIGHT AND LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Copyright (C) 2002-2020 Max Maischein

# AUTHOR

Max Maischein, <corion@cpan.org>

Please contact me if you find bugs or otherwise improve the module. More tests are also very welcome!

# SEE ALSO

[WWW::Mechanize](https://metacpan.org/pod/WWW::Mechanize), [WWW::Mechanize::FormFiller](https://metacpan.org/pod/WWW::Mechanize::FormFiller), [WWW::Mechanize::Firefox](https://metacpan.org/pod/WWW::Mechanize::Firefox)
