=head1 NAME

perlfaq9 - Web, Email and Networking

=head1 VERSION

version 5.20200523

=head1 DESCRIPTION

This section deals with questions related to running web sites,
sending and receiving email as well as general networking.

=head2 Should I use a web framework?

Yes. If you are building a web site with any level of interactivity
(forms / users / databases), you
will want to use a framework to make handling requests
and responses easier.

If there is no interactivity then you may still want
to look at using something like L<Template Toolkit|https://metacpan.org/module/Template>
or L<Plack::Middleware::TemplateToolkit>
so maintenance of your HTML files (and other assets) is easier.

=head2 Which web framework should I use?
X<framework> X<CGI.pm> X<CGI> X<Catalyst> X<Dancer>

There is no simple answer to this question. Perl frameworks can run everything
from basic file servers and small scale intranets to massive multinational
multilingual websites that are at the core of international businesses.

Below is a list of a few frameworks with comments which might help you in
making a decision, depending on your specific requirements. Start by reading
the docs, then ask questions on the relevant mailing list or IRC channel.

=over 4

=item L<Catalyst>

Strongly object-oriented and fully-featured with a long development history and
a large community and addon ecosystem. It is excellent for large and complex
applications, where you have full control over the server.

=item L<Dancer2>

Free of legacy weight, providing a lightweight and easy to learn API.
Has a growing addon ecosystem. It is best used for smaller projects and
very easy to learn for beginners.

=item L<Mojolicious>

Self-contained and powerful for both small and larger projects,
with a focus on HTML5 and real-time web technologies such as WebSockets.

=item L<Web::Simple>

Strongly object-oriented and minimal, built for speed and intended
as a toolkit for building micro web apps, custom frameworks or for tying
together existing Plack-compatible web applications with one central dispatcher.

=back

All of these interact with or use L<Plack>, which is worth understanding
the basics of when building a website in Perl (there is a lot of useful
L<Plack::Middleware|https://metacpan.org/search?q=plack%3A%3Amiddleware>).

=head2 What is Plack and PSGI?

L<PSGI> is the Perl Web Server Gateway Interface Specification, a
standard that many Perl web frameworks use. You should not need to
understand it to build a web site; the part you might want to use is L<Plack>.

L<Plack> is a set of tools for using the PSGI stack. It contains
L<middleware|https://metacpan.org/search?q=plack%3A%3Amiddleware>
components, a reference server and utilities for Web application frameworks.
Plack is like Ruby's Rack or Python's Paste for WSGI.

You could build a web site using L<Plack> and your own code,
but for anything other than a very basic web site, using a web framework
built on Plack (see L<https://plackperl.org>) is a better option.

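To give a feel for what PSGI specifies, here is a minimal sketch of a PSGI
application: a code reference that receives an environment hash and returns
a status code, a list of headers and a body. The greeting text and the file
name F<app.psgi> mentioned afterwards are only illustrative.

    use strict;
    use warnings;

    # A PSGI application is just a code reference
    my $app = sub {
        my ($env) = @_;    # request data, much like CGI environment variables

        my $body = "Hello from $env->{PATH_INFO}\n";

        # Response: status, array ref of header pairs, array ref of body chunks
        return [
            200,
            [ 'Content-Type' => 'text/plain', 'Content-Length' => length $body ],
            [ $body ],
        ];
    };

    $app;    # a .psgi file returns the application as its last value

Saved as F<app.psgi>, this could be served with the C<plackup> command that
ships with Plack (C<plackup app.psgi>). A framework generates this kind of
code reference for you, which is why you rarely need to write one by hand.
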
=head2 How do I remove HTML from a string?

Use L<HTML::Strip>, or L<HTML::FormatText> which not only removes HTML
but also attempts to do a little simple formatting of the resulting
plain text.

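As a rough illustration, assuming L<HTML::Strip> is installed from CPAN,
stripping the tags from a small fragment might look like this. The sample
markup is made up for the example.

    use strict;
    use warnings;
    use HTML::Strip;

    my $html = '<p>Hello, <b>world</b>!</p>';

    my $hs   = HTML::Strip->new();
    my $text = $hs->parse($html);    # the input with its tags removed
    $hs->eof;                        # reset the parser before reusing it

    print "$text\n";

The exact whitespace in the result depends on the module's settings, so
check the output against your own input before relying on it.
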
=head2 How do I extract URLs?

L<HTML::SimpleLinkExtor> will extract URLs from HTML. It handles anchors,
images, objects, frames, and many other tags that can contain a URL.
If you need anything more complex, you can create your own subclass of
L<HTML::LinkExtor> or L<HTML::Parser>. You might even use
L<HTML::SimpleLinkExtor> as an example for something specifically
suited to your needs.

You can use L<URI::Find> or L<URL::Search> to extract URLs from an
arbitrary text document.

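The following sketch shows both approaches side by side, assuming
L<HTML::SimpleLinkExtor> and L<URI::Find> are installed from CPAN. The
sample HTML and text are invented for the example.

    use strict;
    use warnings;
    use HTML::SimpleLinkExtor;
    use URI::Find;

    # Links inside HTML: anchors, images and other URL-carrying tags
    my $html = '<a href="https://metacpan.org/">CPAN</a> <img src="/logo.png">';

    my $extor = HTML::SimpleLinkExtor->new;
    $extor->parse($html);
    my @html_links = $extor->links;    # every link found in the document

    # URLs in plain text
    my $text = 'Docs live at https://metacpan.org/ and '
             . 'http://www.example.com/faq.html today.';

    my @text_links;
    my $finder = URI::Find->new( sub {
        my ( $uri, $original_text ) = @_;
        push @text_links, "$uri";    # $uri is a URI object; stringify it
        return $original_text;       # replacement text: leave input unchanged
    } );
    my $count = $finder->find( \$text );    # number of URIs found

    print "$_\n" for @html_links, @text_links;
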
=head2 How do I fetch an HTML file?

(contributed by brian d foy)

The core L<HTTP::Tiny> module can fetch web resources and give their
content back to you as a string:

    use HTTP::Tiny;

    my $ua = HTTP::Tiny->new;
    my $html = $ua->get( "http://www.example.com/index.html" )->{content};

It can also store the resource directly in a file:

    $ua->mirror( "http://www.example.com/index.html", "foo.html" );

If you need to do something more complicated, the L<HTTP::Tiny> object can
be customized by setting attributes, or you can use L<LWP::UserAgent> from
the libwww-perl distribution or L<Mojo::UserAgent> from the Mojolicious
distribution to make common tasks easier. If you want to simulate an
interactive web browser, you can use the L<WWW::Mechanize> module.

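The short examples above ignore failures. L<HTTP::Tiny> does not throw an
exception when a request fails, so a more careful version might check the
response hash before using the content. In this sketch C<fetch_or_die> is a
hypothetical helper name and the ten second timeout is an arbitrary choice.

    use strict;
    use warnings;
    use HTTP::Tiny;

    # fetch_or_die is a made-up helper name used for illustration
    sub fetch_or_die {
        my ($url) = @_;

        my $response = HTTP::Tiny->new( timeout => 10 )->get($url);

        # success is true for 2xx statuses; 599 marks an internal error
        die "GET $url failed: $response->{status} $response->{reason}\n"
            unless $response->{success};

        return $response->{content};
    }

It could then be called as
C<my $html = fetch_or_die("http://www.example.com/index.html");>.
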
=head2 How do I automate an HTML form submission?

If you are doing something complex, such as moving through many pages
and forms on a web site, you can use L<WWW::Mechanize>. See its
documentation for all the details.

If you're submitting values using the GET method, create a URL and encode
the form using the C<www_form_urlencode> method from L<HTTP::Tiny>:

    use HTTP::Tiny;

    my $ua = HTTP::Tiny->new;

    my $query = $ua->www_form_urlencode([ q => 'DB_File', lucky => 1 ]);
    my $url = "https://metacpan.org/search?$query";
    my $content = $ua->get($url)->{content};

If you're using the POST method, the C<post_form> method will encode the
content appropriately.

    use HTTP::Tiny;

    my $ua = HTTP::Tiny->new;

    my $url = 'https://metacpan.org/search';
    my $form = [ q => 'DB_File', lucky => 1 ];
    my $content = $ua->post_form($url, $form)->{content};

=head2 How do I decode or create those %-encodings on the web?
X<URI> X<URI::Escape> X<RFC 2396>

Most of the time you should not need to do this, as your web framework
or, if you are making a request, L<LWP> or a similar module will
handle it for you.

To encode a string yourself, use the L<URI::Escape> module. The C<uri_escape>
function returns the escaped string:

    use URI::Escape;

    my $original = "Colon : Hash # Percent %";

    my $escaped = uri_escape( $original );

    print "$escaped\n"; # 'Colon%20%3A%20Hash%20%23%20Percent%20%25'

To decode the string, use the C<uri_unescape> function:

    my $unescaped = uri_unescape( $escaped );

    print $unescaped; # back to original

Remember not to encode a full URI, because that would also escape the
delimiters that give it structure. Escape each component separately
and then join them together.

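As a small sketch of escaping components separately, the following builds
a URL from path segments and query parameters. The host name, path and
parameter values are invented for the example.

    use strict;
    use warnings;
    use URI::Escape qw(uri_escape);

    my $base   = 'https://www.example.com';
    my @path   = ( 'docs', 'C# & Perl' );
    my %params = ( q => 'a/b?c', lang => 'en' );

    # Escape every path segment and every key and value on its own ...
    my $path  = join '/', map { uri_escape($_) } @path;
    my $query = join '&',
        map { uri_escape($_) . '=' . uri_escape( $params{$_} ) }
        sort keys %params;

    # ... and only then add the delimiters
    my $url = "$base/$path?$query";

    print "$url\n";
    # https://www.example.com/docs/C%23%20%26%20Perl?lang=en&q=a%2Fb%3Fc
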
=head2 How do I redirect to another page?

Most Perl web frameworks have a mechanism for doing this.
Using the L<Catalyst> framework, it would be:

    $c->res->redirect($url);
    $c->detach();

If you are using Plack (which most frameworks do), then
L<Plack::Middleware::Rewrite> is worth looking at if you
are migrating from Apache or have URLs you want to always
redirect.

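Underneath any framework, a redirect is simply a response with a 3xx
status and a C<Location> header. A bare PSGI sketch, with made-up paths,
could look like this:

    use strict;
    use warnings;

    my $app = sub {
        my ($env) = @_;

        if ( $env->{PATH_INFO} eq '/old-page' ) {
            # 301 for a permanent move; use 302 for a temporary one
            return [ 301, [ 'Location' => '/new-page' ], [] ];
        }

        return [ 200, [ 'Content-Type' => 'text/plain' ], ["Hello\n"] ];
    };

    $app;
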
=head2 How do I put a password on my web pages?

See if the web framework you are using has an
authentication system and if that fits your needs.

Alternatively, look at L<Plack::Middleware::Auth::Basic>,
or one of the other L<Plack authentication|https://metacpan.org/search?q=plack+auth>
options.

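A minimal sketch of wrapping an application with
L<Plack::Middleware::Auth::Basic> follows. The user name, the password and
the C<%password_for> hash are placeholders; a real application would check
credentials against a user store holding hashed passwords.

    use strict;
    use warnings;
    use Plack::Builder;

    my $app = sub {
        return [ 200, [ 'Content-Type' => 'text/plain' ], ["Members only\n"] ];
    };

    # Placeholder credentials for illustration only
    my %password_for = ( alice => 'correct horse battery staple' );

    my $protected = builder {
        enable 'Auth::Basic', authenticator => sub {
            my ( $username, $password, $env ) = @_;
            return defined $password_for{$username}
                && $password_for{$username} eq $password;
        };
        $app;
    };

    $protected;

Basic authentication sends the credentials with every request in an easily
decoded form, so it should only be used over HTTPS.
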
=head2 How do I make sure users can't enter values into a form that cause my CGI script to do bad things?

(contributed by brian d foy)

You can't prevent people from sending your script bad data. Even if
you add some client-side checks, people may disable them or bypass
them completely. For instance, someone might use a module such as
L<LWP> to submit to your web site. If you want to prevent SQL
injection or other sorts of attacks (and you should want to), you
have to distrust all data that enter your program.

The L<perlsec> documentation has general advice about data security.
If you are using the L<DBI> module, use placeholders to fill in data.
If you are running external programs with C<system> or C<exec>, use
the list forms. There are many other precautions that you should take,
too many to list here, and most of them fall under the category of not
using any data that you don't intend to use. Trust no one.

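The two precautions named above can be sketched as follows. The example
assumes L<DBD::SQLite> is installed and a Unix-like system with an C<ls>
command; the table name and the hostile input strings are invented.

    use strict;
    use warnings;
    use DBI;

    # Pretend these came from a form; both try to smuggle in commands
    my $name = "Robert'); DROP TABLE students;--";
    my $file = 'report.txt; rm -rf /';

    # An in-memory database keeps the sketch self-contained
    my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
        { RaiseError => 1 } );
    $dbh->do('CREATE TABLE students (name TEXT)');

    # The ? placeholder passes $name as data, never as SQL text
    my $sth = $dbh->prepare('INSERT INTO students (name) VALUES (?)');
    $sth->execute($name);

    # The list form of system bypasses the shell, so ";" is not special
    system( 'ls', '-l', '--', $file ) == 0
        or warn "ls exited with status ", $? >> 8, "\n";

In both cases the hostile string is treated as a single value: it is stored
verbatim as a name, and C<ls> merely reports that no file of that name
exists.
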
=head2 How do I parse a mail header?

Use the L<Email::MIME> module. It's well-tested and supports all the
craziness that you'll see in the real world (comment-folding whitespace,
encodings, comments, etc.).

  use Email::MIME;

  my $message = Email::MIME->new($rfc2822);
  my $subject = $message->header('Subject');
  my $from    = $message->header('From');

If you've already got some other kind of email object, consider passing
it to L<Email::Abstract> and then using its C<cast> method to get an
L<Email::MIME> object:

  use Email::Abstract;
  my $abstract = Email::Abstract->new($mail_message_object);
  my $email_mime_object = $abstract->cast('Email::MIME');

=head2 How do I check a valid mail address?

(partly contributed by Aaron Sherman)

This isn't as simple a question as it sounds. There are two parts:

a) How do I verify that an email address is correctly formatted?

b) How do I verify that an email address targets a valid recipient?

Without sending mail to the address and seeing whether there's a human
on the other end to answer you, you cannot fully answer part I<b>, but
the L<Email::Valid> module will do both part I<a> and part I<b> as far
as you can in real-time.

Our best advice for verifying a person's mail address is to have them
enter their address twice, just as you normally do to change a
password. This usually weeds out typos. If both versions match, send
mail to that address with a personal message. If you get the message
back and they've followed your directions, you can be reasonably
assured that it's real.

A related strategy that's less open to forgery is to give them a PIN
(personal ID number). Record the address and PIN (best that it be a
random one) for later processing. In the mail you send, include a link to
your site with the PIN included. If the mail bounces, you know it's not
valid. If they don't click on the link, either they forged the address or
(assuming they got the message) following through wasn't important so you
don't need to worry about it.

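For the format check in part I<a>, a short sketch with L<Email::Valid>
(installed from CPAN) might look like the following. The addresses are
examples, and the C<-mxcheck> lookup at the end needs network access and a
DNS resolver that the module can use.

    use strict;
    use warnings;
    use Email::Valid;

    for my $candidate ( 'maurice@example.com', 'not an address' ) {
        # address() returns the address, or undef if it is malformed
        my $address = Email::Valid->address($candidate);
        print "$candidate: ",
            ( defined $address ? 'well-formed' : 'rejected' ), "\n";
    }

    # With -mxcheck the module also asks DNS whether the domain accepts
    # mail. That still cannot prove the mailbox itself exists.
    my $checked = Email::Valid->address(
        -address => 'maurice@example.com',
        -mxcheck => 1,
    );
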
=head2 How do I decode a MIME/BASE64 string?

The L<MIME::Base64> package handles this as well as the MIME/QP encoding.
Decoding base 64 becomes as simple as:

    use MIME::Base64;
    my $decoded = decode_base64($encoded);

The L<Email::MIME> module can decode base 64-encoded email message parts
transparently so the developer doesn't need to worry about it.

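For completeness, here is a round trip through the encoder and the
decoder. The sample string is arbitrary.

    use strict;
    use warnings;
    use MIME::Base64 qw(encode_base64 decode_base64);

    my $original = "Hello, World!";

    # The empty second argument suppresses the trailing newline
    my $encoded = encode_base64( $original, '' );
    print "$encoded\n";    # SGVsbG8sIFdvcmxkIQ==

    my $decoded = decode_base64($encoded);
    print "$decoded\n";    # Hello, World!

Both functions work on bytes, so text containing wide characters should be
encoded (for example as UTF-8) before it is passed to C<encode_base64>.
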
=head2 How do I find the user's mail address?

Ask them for it. There are so many email providers available that it's
unlikely the local system has any idea how to determine a user's email address.

The exception is for organization-specific email (e.g. foo@yourcompany.com)
where policy can be codified in your program. In that case, you could look at
C<$ENV{USER}>, C<$ENV{LOGNAME}>, and C<< getpwuid($<) >> in scalar context,
like so:

  my $user_name = getpwuid($<);

But you still cannot make assumptions about whether this is correct, unless
your policy says it is. You really are best off asking the user.

=head2 How do I send email?

Use the L<Email::Stuffer> module, like so:

  use Email::Stuffer;

  # first, create your message
  my $message = Email::Stuffer->from('you@example.com')
                              ->to('friend@example.com')
                              ->subject('Happy birthday!')
                              ->text_body("Happy birthday to you!\n");

  $message->send_or_die;

By default, L<Email::Sender::Simple> (which the C<send> and C<send_or_die>
methods use under the hood) will try C<sendmail> first, if it exists
in your C<$PATH>. This generally isn't the case. If there's a remote mail
server you use to send mail, consider investigating one of the Transport
classes. At time of writing, the available transports include:

=over 4

=item L<Email::Sender::Transport::Sendmail>

This is the default. If you can use the L<mail(1)> or L<mailx(1)>
program to send mail from the machine where your code runs, you should
be able to use this.

=item L<Email::Sender::Transport::SMTP>

This transport contacts a remote SMTP server over TCP. It optionally
uses TLS or SSL and can authenticate to the server via SASL.

=back

Telling L<Email::Stuffer> to use your transport is straightforward.

  $message->transport($email_sender_transport_object)->send_or_die;

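As a hedged sketch of the SMTP case, the following builds a transport
object and hands it to L<Email::Stuffer>. The host name, port, account and
the C<SMTP_PASSWORD> environment variable are placeholders, and the
C<ssl =E<gt> 'starttls'> setting assumes a reasonably recent
L<Email::Sender> release; check the documentation of your installed
version.

    use strict;
    use warnings;
    use Email::Stuffer;
    use Email::Sender::Transport::SMTP;

    # Host, port and credentials below are placeholders
    my $transport = Email::Sender::Transport::SMTP->new({
        host          => 'smtp.example.com',
        port          => 587,
        ssl           => 'starttls',
        sasl_username => 'you@example.com',
        sasl_password => $ENV{SMTP_PASSWORD},
    });

    Email::Stuffer->from('you@example.com')
                  ->to('friend@example.com')
                  ->subject('Happy birthday!')
                  ->text_body("Happy birthday to you!\n")
                  ->transport($transport)
                  ->send_or_die;
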
=head2 How do I use MIME to make an attachment to a mail message?

L<Email::MIME> directly supports multipart messages. L<Email::MIME>
objects themselves are parts and can be attached to other L<Email::MIME>
objects. Consult the L<Email::MIME> documentation for more information,
including all of the supported methods and examples of their use.

L<Email::Stuffer> uses L<Email::MIME> under the hood to construct
messages, and wraps the most common attachment tasks with the simple
C<attach> and C<attach_file> methods.

  Email::Stuffer->to('friend@example.com')
                ->subject('The file')
                ->attach_file('stuff.csv')
                ->send_or_die;

=head2 How do I read email?

Use the L<Email::Folder> module, like so:

  use Email::Folder;
  use Email::MIME;

  my $folder = Email::Folder->new('/path/to/email/folder');
  while(my $message = $folder->next_message) {
    # next_message returns Email::Simple objects, but we want
    # Email::MIME objects as they're more robust
    my $mime = Email::MIME->new($message->as_string);
  }

There are different classes in the L<Email::Folder> namespace for
supporting various mailbox types. Note that these modules are generally
rather limited and only support B<reading> rather than writing.

=head2 How do I find out my hostname, domainname, or IP address?
X<hostname, domainname, IP address, host, domain, hostfqdn, inet_ntoa,
gethostbyname, Socket, Net::Domain, Sys::Hostname>

(contributed by brian d foy)

The L<Net::Domain> module, which is part of the Standard Library starting
in Perl 5.7.3, can get you the fully qualified domain name (FQDN), the host
name, or the domain name.

    use Net::Domain qw(hostname hostfqdn hostdomain);

    my $host = hostfqdn();

The L<Sys::Hostname> module, part of the Standard Library, can also get the
hostname:

    use Sys::Hostname;

    $host = hostname();

The L<Sys::Hostname::Long> module takes a different approach and tries
harder to return the fully qualified hostname:

  use Sys::Hostname::Long 'hostname_long';

  my $hostname = hostname_long();

To get the IP address, you can use the C<gethostbyname> built-in function
to turn the name into a number. To turn that number into the dotted octet
form (a.b.c.d) that most people expect, use the C<inet_ntoa> function
from the L<Socket> module, which also comes with perl.

    use Socket;

    my $address = inet_ntoa(
        scalar gethostbyname( $host || 'localhost' )
    );

=head2 How do I fetch/put an (S)FTP file?

L<Net::FTP> and L<Net::SFTP> allow you to interact with FTP and SFTP (Secure
FTP) servers.

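A minimal L<Net::FTP> session might be sketched as below. The helper name
C<fetch_and_store> and the file names are hypothetical, and the 30 second
timeout is an arbitrary choice.

    use strict;
    use warnings;
    use Net::FTP;

    # fetch_and_store is a made-up helper; file names are placeholders
    sub fetch_and_store {
        my ( $host, $user, $password ) = @_;

        my $ftp = Net::FTP->new( $host, Timeout => 30 )
            or die "Cannot connect to $host: $@\n";

        $ftp->login( $user, $password )
            or die "Login failed: ", $ftp->message;

        $ftp->binary;    # avoid newline translation for non-text files

        $ftp->get( 'remote-report.csv', 'local-report.csv' )
            or die "Download failed: ", $ftp->message;

        $ftp->put( 'local-upload.csv', 'remote-upload.csv' )
            or die "Upload failed: ", $ftp->message;

        $ftp->quit;
        return 1;
    }

Plain FTP sends the password unencrypted, so prefer SFTP where the server
offers it.
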
=head2 How can I do RPC in Perl?

Use one of the RPC modules (L<https://metacpan.org/search?q=RPC>).

=head1 AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and
other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain. You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit. A simple comment in the code giving
credit would be courteous but is not required.
