Name           Date         Size      #Lines  LOC
..             03-May-2022  -
examples/      08-Jul-2014  -         64      47
lib/POE/       03-May-2022  -         3,499   1,588
t/             08-Jul-2014  -         3,967   2,983
CHANGES        08-Jul-2014  3.8 KiB   106     74
CHANGES.OLD    08-Jul-2014  3.4 KiB   118     86
LICENSE        08-Jul-2014  17.9 KiB  380     292
MANIFEST       08-Jul-2014  974       48      47
MANIFEST.SKIP  08-Jul-2014  208       26      25
META.yml       08-Jul-2014  901       32      31
Makefile.PL    08-Jul-2014  1.7 KiB   72      54
README         08-Jul-2014  19.6 KiB  506     385
README.mkdn    08-Jul-2014  19.4 KiB  547     388
dist.ini       08-Jul-2014  1.3 KiB   50      40

README

NAME
    POE::Component::Client::HTTP - a HTTP user-agent component

VERSION
    version 0.949

SYNOPSIS
      use POE qw(Component::Client::HTTP);

      POE::Component::Client::HTTP->spawn(
        Agent     => 'SpiffCrawler/0.90',   # defaults to something long
        Alias     => 'ua',                  # defaults to 'weeble'
        From      => 'spiffster@perl.org',  # defaults to undef (no header)
        Protocol  => 'HTTP/0.9',            # defaults to 'HTTP/1.1'
        Timeout   => 60,                    # defaults to 180 seconds
        MaxSize   => 16384,                 # defaults to entire response
        Streaming => 4096,                  # defaults to 0 (off)
        FollowRedirects => 2,               # defaults to 0 (off)
        Proxy     => "http://localhost:80", # defaults to HTTP_PROXY env. variable
        NoProxy   => [ "localhost", "127.0.0.1" ], # defs to NO_PROXY env. variable
        BindAddr  => "12.34.56.78",         # defaults to INADDR_ANY
      );

      $kernel->post(
        'ua',        # posts to the 'ua' alias
        'request',   # posts to ua's 'request' state
        'response',  # which of our states will receive the response
        $request,    # an HTTP::Request object
      );

      # This is the sub which is called when the session receives a
      # 'response' event.
      sub response_handler {
        my ($request_packet, $response_packet) = @_[ARG0, ARG1];

        # HTTP::Request
        my $request_object  = $request_packet->[0];

        # HTTP::Response
        my $response_object = $response_packet->[0];

        my $stream_chunk;
        if (! defined($response_object->content)) {
          $stream_chunk = $response_packet->[1];
        }

        print(
          "*" x 78, "\n",
          "*** my request:\n",
          "-" x 78, "\n",
          $request_object->as_string(),
          "*" x 78, "\n",
          "*** their response:\n",
          "-" x 78, "\n",
          $response_object->as_string(),
        );

        if (defined $stream_chunk) {
          print "-" x 40, "\n", $stream_chunk, "\n";
        }

        print "*" x 78, "\n";
      }

DESCRIPTION
    POE::Component::Client::HTTP is an HTTP user-agent for POE. It lets
    other sessions run while HTTP transactions are being processed, and it
    lets several HTTP transactions be processed in parallel.

    It supports keep-alive through POE::Component::Client::Keepalive, which
    in turn uses POE::Component::Resolver for asynchronous IPv4 and IPv6
    name resolution.

    HTTP client components are not proper objects. Instead of being created,
    as most objects are, they are "spawned" as separate sessions. To avoid
    confusion (and hopefully not cause other confusion), they must be
    spawned with a "spawn" method, not created anew with a "new" one.

CONSTRUCTOR
  spawn
    PoCo::Client::HTTP's "spawn" method takes a few named parameters:

    Agent => $user_agent_string
    Agent => \@list_of_agents
      If a User-Agent header is not present in the HTTP::Request, a random
      one will be used from those specified by the "Agent" parameter. If
      none are supplied, POE::Component::Client::HTTP will advertise itself
      to the server.

      "Agent" may contain a reference to a list of user agents. If this is
      the case, PoCo::Client::HTTP will choose one of them at random for
      each request.

    Alias => $session_alias
      "Alias" sets the name by which the session will be known. If no alias
      is given, the component defaults to "weeble". The alias lets several
      sessions interact with HTTP components without keeping (or even
      knowing) hard references to them. It's possible to spawn several HTTP
      components with different names.

    ConnectionManager => $poco_client_keepalive
      "ConnectionManager" sets this component's connection pool manager. It
      expects the connection manager to be a reference to a
      POE::Component::Client::Keepalive object. The HTTP client component
      will call "allocate()" on the connection manager itself so you should
      not have done this already.

        my $pool = POE::Component::Client::Keepalive->new(
          keep_alive    => 10, # seconds to keep connections alive
          max_open      => 100, # max concurrent connections - total
          max_per_host  => 20, # max concurrent connections - per host
          timeout       => 30, # max time (seconds) to establish a new connection
        );

        POE::Component::Client::HTTP->spawn(
          # ...
          ConnectionManager => $pool,
          # ...
        );

      See POE::Component::Client::Keepalive for more information, including
      how to alter the connection manager's resolver configuration (for
      example, to force IPv6 or prefer it before IPv4).

    CookieJar => $cookie_jar
      "CookieJar" sets the component's cookie jar. It expects the cookie jar
      to be a reference to an HTTP::Cookies object.

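      A minimal sketch of passing a cookie jar to the component, assuming
      HTTP::Cookies is installed. The 'ua' alias is only illustrative, and
      the fragment is meant to sit inside a larger POE program that later
      calls POE::Kernel->run().

```perl
use strict;
use warnings;
use POE qw(Component::Client::HTTP);
use HTTP::Cookies;

# An in-memory jar. HTTP::Cookies->new() also accepts file => ... and
# autosave => 1 if cookies should persist between runs.
my $cookie_jar = HTTP::Cookies->new();

POE::Component::Client::HTTP->spawn(
  Alias     => 'ua',          # illustrative alias
  CookieJar => $cookie_jar,   # shared by every request sent through 'ua'
);
```
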
    From => $admin_address
      "From" holds an e-mail address where the client's administrator and/or
      maintainer may be reached. It defaults to undef, which means no From
      header will be included in requests.

    MaxSize => OCTETS
      "MaxSize" specifies the largest response to accept from a server. The
      content of larger responses will be truncated to OCTETS octets. This
      has been used to return the <head></head> section of web pages without
      the need to wade through <body></body>.

    NoProxy => [ $host_1, $host_2, ..., $host_N ]
    NoProxy => "host1,host2,hostN"
      "NoProxy" specifies a list of server hosts that will not be proxied.
      It is useful for local hosts and hosts that do not properly support
      proxying. If NoProxy is not specified, a list will be taken from the
      NO_PROXY environment variable.

        NoProxy => [ "localhost", "127.0.0.1" ],
        NoProxy => "localhost,127.0.0.1",

    BindAddr => $local_ip
      Specify "BindAddr" to bind all client sockets to a particular local
      address. The value of BindAddr will be passed through
      POE::Component::Client::Keepalive to POE::Wheel::SocketFactory (as
      "bind_address"). See that module's documentation for implementation
      details.

        BindAddr => "12.34.56.78"

    Protocol => $http_protocol_string
      "Protocol" advertises the protocol that the client wishes to see.
      Under normal circumstances, it should be left to its default value:
      "HTTP/1.1".

    Proxy => [ $proxy_host, $proxy_port ]
    Proxy => $proxy_url
    Proxy => $proxy_url,$proxy_url,...
      "Proxy" specifies one or more proxy hosts that requests will be passed
      through. If not specified, proxy servers will be taken from the
      HTTP_PROXY (or http_proxy) environment variable. No proxying will
      occur unless Proxy is set or one of the environment variables exists.

      The proxy can be specified either as a host and port, or as one or
      more URLs. Proxy URLs must specify the proxy port, even if it is 80.

        Proxy => [ "127.0.0.1", 80 ],
        Proxy => "http://127.0.0.1:80/",

      "Proxy" may specify multiple proxies separated by commas.
      PoCo::Client::HTTP will choose proxies from this list at random. This
      is useful for load balancing requests through multiple gateways.

        Proxy => "http://127.0.0.1:80/,http://127.0.0.1:81/",

    Streaming => OCTETS
      "Streaming" allows Client::HTTP to return large content in chunks (of
      OCTETS octets each) rather than combine the entire content into a
      single HTTP::Response object.

      By default, Client::HTTP reads the entire content for a response into
      memory before returning an HTTP::Response object. This is obviously
      bad for applications like streaming MP3 clients, because they often
      fetch songs that never end. Yes, they go on and on, my friend.

      When "Streaming" is set to nonzero, however, the response handler
      receives chunks of up to OCTETS octets apiece. The response packet in
      ARG1 carries slightly different contents in this case. Its first
      element is still an HTTP::Response object, but it does not contain
      response content. Its second element contains a chunk of raw response
      content, or undef if the stream has ended.

        sub streaming_response_handler {
          my $response_packet = $_[ARG1];
          my ($response, $data) = @$response_packet;
          # SAVED_STREAM is a filehandle the application opened earlier.
          print SAVED_STREAM $data if defined $data;
        }

    FollowRedirects => $number_of_hops_to_follow
      "FollowRedirects" specifies how many redirects (e.g. 302 Found) to
      follow. If not specified, it defaults to 0, and thus no redirection is
      followed. This maintains compatibility with the previous behavior,
      which was not to follow redirects at all.

      If redirects are followed, a response chain should be built, and can
      be accessed through $response_object->previous(). See HTTP::Response
      for details here.

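      The response chain can be walked with a small helper such as the one
      sketched below. The name redirect_chain is hypothetical; the only
      library call it relies on is HTTP::Response's previous() accessor.

```perl
use strict;
use warnings;
use HTTP::Response;

# Return every response in the chain, oldest first, by walking back
# through previous() from the final response.
sub redirect_chain {
  my ($response) = @_;
  my @chain;
  for (my $r = $response; defined $r; $r = $r->previous) {
    unshift @chain, $r;
  }
  return @chain;
}

# Inside a response handler this might be used as:
#   my @hops = redirect_chain($response_packet->[0]);
#   print join(" -> ", map { $_->code } @hops), "\n";
```
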
    Timeout => $query_timeout
      "Timeout" sets how long POE::Component::Client::HTTP has to process an
      application's request, in seconds. "Timeout" defaults to 180 (three
      minutes) if not specified.

      It's important to note that the timeout begins when the component
      receives an application's request, not when it attempts to connect to
      the web server.

      Timeouts may result from sending the component too many requests at
      once. Each request would need to be received and tracked in order.
      Consider this:

        $_[KERNEL]->post(component => request => ...) for (1..15_000);

      15,000 requests are queued together in one enormous bolus. The
      component would receive and initialize them in order. The first socket
      activity wouldn't arrive until the 15,000th request was set up. If
      that took longer than "Timeout", then the requests that have waited
      too long would fail.

      "ConnectionManager"'s own timeout and concurrency limits also affect
      how many requests may be processed at once. For example, most of the
      15,000 requests would wait in the connection manager's pool until
      sockets become available. Meanwhile, the "Timeout" would be counting
      down.

      Applications may elect to control concurrency outside the component's
      "Timeout". They may do so in a few ways.

      The easiest way is to limit the initial number of requests to
      something more manageable. As responses arrive, the application should
      handle them and start new requests. This limits concurrency to the
      initial request count.

      An application may also outsource job throttling to another module,
      such as POE::Component::JobQueue.

      In any case, "Timeout" and "ConnectionManager" may be tuned to
      maximize timeouts and concurrency limits. This may help in some cases.
      Developers should be aware that doing so will increase memory usage.
      POE::Component::Client::HTTP and POE::Component::Client::Keepalive
      track requests in memory, while applications are free to keep pending
      requests on disk.

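      The first throttling approach described above (start a limited number
      of requests, then start one more as each response arrives) might look
      like the sketch below. The URL list, the limit of 10, and the state
      names next_request and got_response are all assumptions made for the
      example. It also assumes Streaming is off, so each request produces
      exactly one response event.

```perl
use strict;
use warnings;
use POE qw(Component::Client::HTTP);
use HTTP::Request::Common qw(GET);

# Hypothetical work list and concurrency limit.
my @urls         = map { "http://example.com/page/$_" } 1 .. 1000;
my $max_parallel = 10;

POE::Component::Client::HTTP->spawn(Alias => 'ua');

POE::Session->create(
  inline_states => {
    _start => sub {
      $_[HEAP]{in_flight} = 0;
      # Prime the component with at most $max_parallel requests.
      $_[KERNEL]->yield('next_request') for 1 .. $max_parallel;
    },

    next_request => sub {
      my $url = shift @urls;
      if (defined $url) {
        $_[HEAP]{in_flight}++;
        $_[KERNEL]->post(ua => request => got_response => GET($url));
      }
      elsif ($_[HEAP]{in_flight} == 0) {
        # Nothing queued and nothing pending: let the component exit.
        $_[KERNEL]->post(ua => 'shutdown');
      }
    },

    got_response => sub {
      my ($request_packet, $response_packet) = @_[ARG0, ARG1];
      my $response = $response_packet->[0];
      print $request_packet->[0]->uri, ' => ', $response->code, "\n";

      # One request finished, so start the next one.
      $_[HEAP]{in_flight}--;
      $_[KERNEL]->yield('next_request');
    },
  },
);

POE::Kernel->run();
```

      Because no more than $max_parallel requests are ever inside the
      component, each request's "Timeout" starts counting only shortly
      before the request can actually be serviced.
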
ACCEPTED EVENTS
    Sessions communicate asynchronously with PoCo::Client::HTTP. They post
    requests to it, and it posts responses back.

  request
    Requests are posted to the component's "request" state. They include an
    HTTP::Request object which defines the request. For example:

      $kernel->post(
        'ua', 'request',            # http session alias & state
        'response',                 # my state to receive responses
        GET('http://poe.perl.org'), # a simple HTTP request
        'unique id',                # a tag to identify the request
        'progress',                 # an event to indicate progress
        'http://1.2.3.4:80/'        # proxy to use for this request
      );

    Requests include the state to which responses will be posted. In the
    previous example, the handler for a 'response' state will be called with
    each HTTP response. The "progress" handler is optional and if installed,
    the component will provide progress metrics (see sample handler below).
    The "proxy" parameter is optional and if not defined, a default proxy
    will be used if configured. No proxy will be used if neither a default
    one nor a "proxy" parameter is defined.

  pending_requests_count
    There's also a pending_requests_count state that returns the number of
    requests currently being processed. To receive the return value, it must
    be invoked with $kernel->call().

      my $count = $kernel->call('ua' => 'pending_requests_count');

    NOTE: Sometimes the count might not be what you expected, because
    responses are currently in POE's queue and you haven't processed them.
    This could happen if you configure the "ConnectionManager"'s concurrency
    to a high enough value.

  cancel
    Cancel a specific HTTP request. Requires a reference to the original
    request (blessed or stringified) so it knows which one to cancel. See
    "progress handler" below for notes on canceling streaming requests.

    To cancel a request based on its blessed HTTP::Request object:

      $kernel->post( component => cancel => $http_request );

    To cancel a request based on its stringified HTTP::Request object:

      $kernel->post( component => cancel => "$http_request" );

  shutdown
    Responds to all pending requests with 408 (request timeout), and then
    shuts down the component and all subcomponents.

SENT EVENTS
  response handler
    In addition to all the usual POE parameters, HTTP responses come with
    two list references:

      my ($request_packet, $response_packet) = @_[ARG0, ARG1];

    $request_packet contains a reference to the original HTTP::Request
    object. This is useful for matching responses back to the requests that
    generated them.

      my $http_request_object = $request_packet->[0];
      my $http_request_tag    = $request_packet->[1]; # from the 'request' post

    $response_packet contains a reference to the resulting HTTP::Response
    object.

      my $http_response_object = $response_packet->[0];

    Please see the HTTP::Request and HTTP::Response manpages for more
    information.

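    As a sketch of how the request tag can tie a response back to
    application state, the handler below looks the tag up in a table. The
    %pending hash and its job names are hypothetical; the packet layout is
    the one described above.

```perl
use strict;
use warnings;
use POE;              # exports ARG0 and ARG1
use HTTP::Response;

# Hypothetical bookkeeping: request tag => what was being fetched.
my %pending = (
  'job-1' => 'front page',
  'job-2' => 'news feed',
);

# Requests would have been posted with the tag after the request object:
#   $kernel->post(ua => request => 'response', $http_request, 'job-1');

sub response_handler {
  my ($request_packet, $response_packet) = @_[ARG0, ARG1];

  my $tag      = $request_packet->[1];
  my $response = $response_packet->[0];

  # Look up, and forget, whatever the tag refers to.
  my $what = defined $tag ? delete $pending{$tag} : undef;
  $what = 'untagged request' unless defined $what;

  return sprintf("%s: %s", $what, $response->status_line);
}
```
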
  progress handler
    The example progress handler shows how to calculate a percentage of
    download completion.

      sub progress_handler {
        my $gen_args  = $_[ARG0];    # args passed to all calls
        my $call_args = $_[ARG1];    # args specific to the call

        my $req = $gen_args->[0];    # HTTP::Request object being serviced
        my $tag = $gen_args->[1];    # Request ID tag from the 'request' post.
        my $got = $call_args->[0];   # Number of bytes retrieved so far.
        my $tot = $call_args->[1];   # Total bytes to be retrieved.
        my $oct = $call_args->[2];   # Chunk of raw octets received this time.

        # Avoid dividing by zero when the total size is not known.
        my $percent = $tot ? $got / $tot * 100 : 0;

        printf(
          "-- %.0f%% [%d/%d]: %s\n", $percent, $got, $tot, $req->uri()
        );

        # To cancel the request:
        # $_[KERNEL]->post( component => cancel => $req );
      }

   DEPRECATION WARNING
    The third return argument (the raw octets received) has been deprecated.
    Instead of it, use the Streaming parameter to get chunks of content in
    the response handler.

REQUEST CALLBACKS
    The HTTP::Request object passed to the request event can contain a CODE
    reference as "content". This allows for sending large files without
    wasting memory. Your callback should return a chunk of data each time it
    is called, and an empty string when done. Don't forget to set the
    Content-Length header correctly. Example:

      my $request = HTTP::Request->new( PUT => 'http://...' );

      my $file = '/path/to/large_file';

      open my $fh, '<', $file or die "Cannot open $file: $!";
      binmode $fh;

      # Return the next chunk, or an empty string at end of file.
      my $upload_cb = sub {
        if ( sysread $fh, my $buf, 4096 ) {
          return $buf;
        }
        else {
          close $fh;
          return '';
        }
      };

      $request->content_length( -s $file );

      $request->content( $upload_cb );

      $kernel->post( ua => 'request', 'response', $request );

CONTENT ENCODING AND COMPRESSION
    Transparent content decoding has been disabled as of version 0.84. This
    also removes support for transparent gzip requesting and decompression.

    To re-enable gzip compression, request gzip through the Accept-Encoding
    header and use HTTP::Response's decoded_content() method rather than
    content():

      my $request = HTTP::Request->new(
        GET => "http://www.yahoo.com/", [
          'Accept-Encoding' => 'gzip'
        ]
      );

      # ... time passes ...

      my $content = $response->decoded_content();

    The change in POE::Component::Client::HTTP behavior was prompted by
    changes in HTTP::Response that surfaced a bug in the component's
    transparent gzip handling.

    Allowing the application to specify and handle content encodings seems
    to be the most reliable and flexible resolution.

    For more information about the problem and discussions regarding the
    solution, see: <http://www.perlmonks.org/?node_id=683833> and
    <http://rt.cpan.org/Ticket/Display.html?id=35538>

CLIENT HEADERS
    POE::Component::Client::HTTP sets its own response headers with
    additional information. All of its headers begin with "X-PCCH".

  X-PCCH-Errmsg
    POE::Component::Client::HTTP may fail because of an internal client
    error rather than an HTTP protocol error. X-PCCH-Errmsg will contain a
    human readable reason for client failures, should they occur.

    The text of X-PCCH-Errmsg may also be repeated in the response's
    content.

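    One way to separate client-side failures from ordinary HTTP errors is
    to check for this header in the response handler. The helper below is
    a sketch; describe_response is a hypothetical name, and the output
    strings are arbitrary.

```perl
use strict;
use warnings;
use HTTP::Response;

# Summarize a response, distinguishing client failures (reported through
# X-PCCH-Errmsg) from errors returned by the server.
sub describe_response {
  my ($response) = @_;

  return 'ok: ' . $response->status_line if $response->is_success;

  my $client_error = $response->header('X-PCCH-Errmsg');
  return "client error: $client_error" if defined $client_error;

  return 'http error: ' . $response->status_line;
}
```
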
  X-PCCH-Peer
    X-PCCH-Peer contains the remote IPv4 address and port, separated by a
    period. For example, "127.0.0.1.8675" represents port 8675 on localhost.

    Proxying will render X-PCCH-Peer nearly useless, since the socket will
    be connected to a proxy rather than the server itself.

    This feature was added at Doreen Grey's request. Doreen wanted a means
    to find the remote server's address without having to make an additional
    request.

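    Since the address and the port share the same separator, the port is
    whatever follows the final period. A small sketch of splitting the
    value, with parse_pcch_peer as a hypothetical helper name:

```perl
use strict;
use warnings;

# Split an X-PCCH-Peer value such as "127.0.0.1.8675" into address and
# port. Returns an empty list if the value is missing or malformed.
sub parse_pcch_peer {
  my ($peer) = @_;
  return unless defined $peer && $peer =~ /\A(.+)\.(\d+)\z/;
  return ($1, $2);
}

# In a response handler:
#   my ($address, $port) =
#     parse_pcch_peer($response_object->header('X-PCCH-Peer'));
```
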
ENVIRONMENT
    POE::Component::Client::HTTP uses two standard environment variables:
    HTTP_PROXY and NO_PROXY.

    HTTP_PROXY sets the proxy server that Client::HTTP will forward requests
    through. NO_PROXY sets a list of hosts that will not be forwarded
    through a proxy.

    See the Proxy and NoProxy constructor parameters for more information
    about these variables.

SEE ALSO
    This component is built upon HTTP::Request, HTTP::Response, and POE.
    Please see its source code and the documentation for its foundation
    modules to learn more. If you want to use cookies, you'll need to read
    about HTTP::Cookies as well.

    Also see the test program, t/01_request.t, in the PoCo::Client::HTTP
    distribution.

BUGS
    There is no support for CGI_PROXY or CgiProxy.

    Secure HTTP (https) proxying is not supported at this time.

    There is no object oriented interface. See
    POE::Component::Client::Keepalive and POE::Component::Resolver for
    examples of a decent OO interface.

AUTHOR, COPYRIGHT, & LICENSE
    POE::Component::Client::HTTP is

    * Copyright 1999-2009 Rocco Caputo

    * Copyright 2004 Rob Bloodgood

    * Copyright 2004-2005 Martijn van Beers

    All rights are reserved. POE::Component::Client::HTTP is free software;
    you may redistribute it and/or modify it under the same terms as Perl
    itself.

CONTRIBUTORS
    Joel Bernstein solved some nasty race conditions. Portugal Telecom
    <http://www.sapo.pt/> was kind enough to support his contributions.

    Jeff Bisbee added POD tests and documentation to pass several of them to
    version 0.79. He's a kwalitee-increasing machine!

BUG TRACKER
    https://rt.cpan.org/Dist/Display.html?Queue=POE-Component-Client-HTTP

REPOSITORY
    Github: <http://github.com/rcaputo/poe-component-client-http> .

    Gitorious: <http://gitorious.org/poe-component-client-http> .

OTHER RESOURCES
    <http://search.cpan.org/dist/POE-Component-Client-HTTP/>

README.mkdn

# NAME

POE::Component::Client::HTTP - a HTTP user-agent component

# VERSION

version 0.949

# SYNOPSIS

    use POE qw(Component::Client::HTTP);

    POE::Component::Client::HTTP->spawn(
      Agent     => 'SpiffCrawler/0.90',   # defaults to something long
      Alias     => 'ua',                  # defaults to 'weeble'
      From      => 'spiffster@perl.org',  # defaults to undef (no header)
      Protocol  => 'HTTP/0.9',            # defaults to 'HTTP/1.1'
      Timeout   => 60,                    # defaults to 180 seconds
      MaxSize   => 16384,                 # defaults to entire response
      Streaming => 4096,                  # defaults to 0 (off)
      FollowRedirects => 2,               # defaults to 0 (off)
      Proxy     => "http://localhost:80", # defaults to HTTP_PROXY env. variable
      NoProxy   => [ "localhost", "127.0.0.1" ], # defs to NO_PROXY env. variable
      BindAddr  => "12.34.56.78",         # defaults to INADDR_ANY
    );

    $kernel->post(
      'ua',        # posts to the 'ua' alias
      'request',   # posts to ua's 'request' state
      'response',  # which of our states will receive the response
      $request,    # an HTTP::Request object
    );

    # This is the sub which is called when the session receives a
    # 'response' event.
    sub response_handler {
      my ($request_packet, $response_packet) = @_[ARG0, ARG1];

      # HTTP::Request
      my $request_object  = $request_packet->[0];

      # HTTP::Response
      my $response_object = $response_packet->[0];

      my $stream_chunk;
      if (! defined($response_object->content)) {
        $stream_chunk = $response_packet->[1];
      }

      print(
        "*" x 78, "\n",
        "*** my request:\n",
        "-" x 78, "\n",
        $request_object->as_string(),
        "*" x 78, "\n",
        "*** their response:\n",
        "-" x 78, "\n",
        $response_object->as_string(),
      );

      if (defined $stream_chunk) {
        print "-" x 40, "\n", $stream_chunk, "\n";
      }

      print "*" x 78, "\n";
    }

# DESCRIPTION

POE::Component::Client::HTTP is an HTTP user-agent for POE.  It lets
other sessions run while HTTP transactions are being processed, and it
lets several HTTP transactions be processed in parallel.

It supports keep-alive through POE::Component::Client::Keepalive,
which in turn uses POE::Component::Resolver for asynchronous IPv4 and
IPv6 name resolution.

HTTP client components are not proper objects.  Instead of being
created, as most objects are, they are "spawned" as separate sessions.
To avoid confusion (and hopefully not cause other confusion), they
must be spawned with a `spawn` method, not created anew with a `new`
one.

# CONSTRUCTOR

## spawn

PoCo::Client::HTTP's `spawn` method takes a few named parameters:

- Agent => $user\_agent\_string
- Agent => \\@list\_of\_agents

    If a User-Agent header is not present in the HTTP::Request, a random
    one will be used from those specified by the `Agent` parameter.  If
    none are supplied, POE::Component::Client::HTTP will advertise itself
    to the server.

    `Agent` may contain a reference to a list of user agents.  If this is
    the case, PoCo::Client::HTTP will choose one of them at random for
    each request.

- Alias => $session\_alias

    `Alias` sets the name by which the session will be known.  If no
    alias is given, the component defaults to "weeble".  The alias lets
    several sessions interact with HTTP components without keeping (or
    even knowing) hard references to them.  It's possible to spawn several
    HTTP components with different names.

- ConnectionManager => $poco\_client\_keepalive

    `ConnectionManager` sets this component's connection pool manager.
    It expects the connection manager to be a reference to a
    POE::Component::Client::Keepalive object.  The HTTP client component
    will call `allocate()` on the connection manager itself so you should
    not have done this already.

        my $pool = POE::Component::Client::Keepalive->new(
          keep_alive    => 10, # seconds to keep connections alive
          max_open      => 100, # max concurrent connections - total
          max_per_host  => 20, # max concurrent connections - per host
          timeout       => 30, # max time (seconds) to establish a new connection
        );

        POE::Component::Client::HTTP->spawn(
          # ...
          ConnectionManager => $pool,
          # ...
        );

    See [POE::Component::Client::Keepalive](https://metacpan.org/pod/POE::Component::Client::Keepalive) for more information,
    including how to alter the connection manager's resolver
    configuration (for example, to force IPv6 or prefer it before IPv4).

- CookieJar => $cookie\_jar

    `CookieJar` sets the component's cookie jar.  It expects the cookie
    jar to be a reference to an HTTP::Cookies object.

- From => $admin\_address

    `From` holds an e-mail address where the client's administrator
    and/or maintainer may be reached.  It defaults to undef, which means
    no From header will be included in requests.

- MaxSize => OCTETS

    `MaxSize` specifies the largest response to accept from a server.
    The content of larger responses will be truncated to OCTETS octets.
    This has been used to return the `<head></head>` section of web pages
    without the need to wade through `<body></body>`.

- NoProxy => \[ $host\_1, $host\_2, ..., $host\_N \]
- NoProxy => "host1,host2,hostN"

    `NoProxy` specifies a list of server hosts that will not be proxied.
    It is useful for local hosts and hosts that do not properly support
    proxying.  If NoProxy is not specified, a list will be taken from the
    NO\_PROXY environment variable.

        NoProxy => [ "localhost", "127.0.0.1" ],
        NoProxy => "localhost,127.0.0.1",

- BindAddr => $local\_ip

    Specify `BindAddr` to bind all client sockets to a particular local
    address.  The value of BindAddr will be passed through
    POE::Component::Client::Keepalive to POE::Wheel::SocketFactory (as
    `bind_address`).  See that module's documentation for implementation
    details.

        BindAddr => "12.34.56.78"

- Protocol => $http\_protocol\_string

    `Protocol` advertises the protocol that the client wishes to see.
    Under normal circumstances, it should be left to its default value:
    "HTTP/1.1".

- Proxy => \[ $proxy\_host, $proxy\_port \]
- Proxy => $proxy\_url
- Proxy => $proxy\_url,$proxy\_url,...

    `Proxy` specifies one or more proxy hosts that requests will be
    passed through.  If not specified, proxy servers will be taken from
    the HTTP\_PROXY (or http\_proxy) environment variable.  No proxying will
    occur unless Proxy is set or one of the environment variables exists.

    The proxy can be specified either as a host and port, or as one or
    more URLs.  Proxy URLs must specify the proxy port, even if it is 80.

        Proxy => [ "127.0.0.1", 80 ],
        Proxy => "http://127.0.0.1:80/",

    `Proxy` may specify multiple proxies separated by commas.
    PoCo::Client::HTTP will choose proxies from this list at random.  This
    is useful for load balancing requests through multiple gateways.

        Proxy => "http://127.0.0.1:80/,http://127.0.0.1:81/",

- Streaming => OCTETS

    `Streaming` allows Client::HTTP to return large content in chunks (of
    OCTETS octets each) rather than combine the entire content into a
    single HTTP::Response object.

    By default, Client::HTTP reads the entire content for a response into
    memory before returning an HTTP::Response object.  This is obviously
    bad for applications like streaming MP3 clients, because they often
    fetch songs that never end.  Yes, they go on and on, my friend.

    When `Streaming` is set to nonzero, however, the response handler
    receives chunks of up to OCTETS octets apiece.  The response packet
    in ARG1 carries slightly different contents in this case.  Its first
    element is still an HTTP::Response object, but it does not contain
    response content.  Its second element contains a chunk of raw
    response content, or undef if the stream has ended.

        sub streaming_response_handler {
          my $response_packet = $_[ARG1];
          my ($response, $data) = @$response_packet;
          # SAVED_STREAM is a filehandle the application opened earlier.
          print SAVED_STREAM $data if defined $data;
        }

- FollowRedirects => $number\_of\_hops\_to\_follow

    `FollowRedirects` specifies how many redirects (e.g. 302 Found) to
    follow.  If not specified, it defaults to 0, and thus no redirection
    is followed.  This maintains compatibility with the previous behavior,
    which was not to follow redirects at all.

    If redirects are followed, a response chain should be built, and can
    be accessed through $response\_object->previous(). See HTTP::Response
    for details here.

236- Timeout => $query\_timeout
237
238    `Timeout` sets how long POE::Component::Client::HTTP has to process
239    an application's request, in seconds.  `Timeout` defaults to 180
240    (three minutes) if not specified.
241
242    It's important to note that the timeout begins when the component
243    receives an application's request, not when it attempts to connect to
244    the web server.
245
246    Timeouts may result from sending the component too many requests at
247    once.  Each request would need to be received and tracked in order.
248    Consider this:
249
250        $_[KERNEL]->post(component => request => ...) for (1..15_000);
251
252    15,000 requests are queued together in one enormous bolus.  The
253    component would receive and initialize them in order.  The first
254    socket activity wouldn't arrive until the 15,000th request was set up.
255    If that took longer than `Timeout`, then the requests that have
256    waited too long would fail.
257
258    `ConnectionManager`'s own timeout and concurrency limits also affect
259    how many requests may be processed at once.  For example, most of the
260    15,000 requests would wait in the connection manager's pool until
261    sockets become available.  Meanwhile, the `Timeout` would be counting
262    down.
263
264    Applications may elect to control concurrency outside the component's
265    `Timeout`.  They may do so in a few ways.
266
267    The easiest way is to limit the initial number of requests to
268    something more manageable.  As responses arrive, the application
269    should handle them and start new requests.  This limits concurrency to
270    the initial request count.
271
272    An application may also outsource job throttling to another module,
273    such as POE::Component::JobQueue.
274
275    In any case, `Timeout` and `ConnectionManager` may be tuned to
276    maximize timeouts and concurrency limits.  This may help in some
277    cases.  Developers should be aware that doing so will increase memory
278    usage.  POE::Component::Client::HTTP and KeepAlive track requests in
279    memory, while applications are free to keep pending requests on disk.
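The first approach above (start a limited batch, then start one new
request per response) can be sketched without any POE-specific code.
This is a minimal sketch, not part of the component: `make_throttle`,
`$fill`, and `$done` are hypothetical names, and the `$start` callback
stands in for whatever posts a request, for example a call to
`$kernel->post()` against the component's "request" state.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of POE::Component::Client::HTTP.
#   $limit   - maximum number of requests in flight at once
#   $pending - array reference of requests that have not been started
#   $start   - callback that actually submits one request
sub make_throttle {
  my ($limit, $pending, $start) = @_;
  my $in_flight = 0;

  # Start queued requests until the limit is reached or the queue is empty.
  my $fill = sub {
    while ($in_flight < $limit and @$pending) {
      $in_flight++;
      $start->(shift @$pending);
    }
  };

  # Call this once per completed response to free a slot and refill.
  my $done = sub {
    $in_flight-- if $in_flight > 0;
    $fill->();
  };

  return ($fill, $done);
}
```

An application would call `$fill->()` once at startup and `$done->()`
at the end of its response handler, so that at most `$limit` requests
are counting down against `Timeout` at any time.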
280
281# ACCEPTED EVENTS
282
283Sessions communicate asynchronously with PoCo::Client::HTTP.  They
284post requests to it, and it posts responses back.
285
286## request
287
288Requests are posted to the component's "request" state.  They include
289an HTTP::Request object which defines the request.  For example:
290
    # GET() is exported by HTTP::Request::Common.
    $kernel->post(
      'ua', 'request',            # http session alias & state
      'response',                 # my state to receive responses
      GET('http://poe.perl.org'), # a simple HTTP request
      'unique id',                # a tag to identify the request
      'progress',                 # an event to indicate progress
      'http://1.2.3.4:80/'        # proxy to use for this request
    );
299
300Requests include the state to which responses will be posted.  In the
301previous example, the handler for a 'response' state will be called
302with each HTTP response.  The "progress" handler is optional and if
303installed, the component will provide progress metrics (see sample
304handler below).  The "proxy" parameter is optional and if not defined,
305a default proxy will be used if configured.  No proxy will be used if
306neither a default one nor a "proxy" parameter is defined.
307
308## pending\_requests\_count
309
310There's also a pending\_requests\_count state that returns the number of
311requests currently being processed.  To receive the return value, it
312must be invoked with $kernel->call().
313
314    my $count = $kernel->call('ua' => 'pending_requests_count');
315
316NOTE: Sometimes the count might not be what you expected, because responses
317are currently in POE's queue and you haven't processed them. This could happen
318if you configure the `ConnectionManager`'s concurrency to a high enough value.
319
320## cancel
321
322Cancel a specific HTTP request.  Requires a reference to the original
323request (blessed or stringified) so it knows which one to cancel.  See
324["progress handler"](#progress-handler) below for notes on canceling streaming requests.
325
326To cancel a request based on its blessed HTTP::Request object:
327
328    $kernel->post( component => cancel => $http_request );
329
330To cancel a request based on its stringified HTTP::Request object:
331
332    $kernel->post( component => cancel => "$http_request" );
333
334## shutdown
335
336Responds to all pending requests with 408 (request timeout), and then
337shuts down the component and all subcomponents.
338
339# SENT EVENTS
340
341## response handler
342
343In addition to all the usual POE parameters, HTTP responses come with
344two list references:
345
346    my ($request_packet, $response_packet) = @_[ARG0, ARG1];
347
348`$request_packet` contains a reference to the original HTTP::Request
349object.  This is useful for matching responses back to the requests
350that generated them.
351
352    my $http_request_object = $request_packet->[0];
353    my $http_request_tag    = $request_packet->[1]; # from the 'request' post
354
355`$response_packet` contains a reference to the resulting
356HTTP::Response object.
357
358    my $http_response_object = $response_packet->[0];
359
360Please see the HTTP::Request and HTTP::Response manpages for more
361information.
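When `FollowRedirects` is enabled, the HTTP::Response in the response
packet links back to earlier responses through previous().  The
following sketch collects that chain oldest-first.  `redirect_chain` is
a hypothetical helper name, and it assumes only an object with a
previous() method, as HTTP::Response provides.

```perl
use strict;
use warnings;

# Hypothetical helper: return every response in a redirect chain,
# oldest first, ending with the response that was passed in.
sub redirect_chain {
  my ($response) = @_;
  my @chain;
  while (defined $response) {
    unshift @chain, $response;
    $response = $response->previous();
  }
  return @chain;
}
```

In a response handler, `redirect_chain($response_packet->[0])` would
yield any intermediate redirect responses followed by the final one.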
362
363## progress handler
364
365The example progress handler shows how to calculate a percentage of
366download completion.
367
    sub progress_handler {
      my $gen_args  = $_[ARG0];    # args passed to all calls
      my $call_args = $_[ARG1];    # args specific to the call

      my $req = $gen_args->[0];    # HTTP::Request object being serviced
      my $tag = $gen_args->[1];    # Request ID tag from the 'request' post.
      my $got = $call_args->[0];   # Number of bytes retrieved so far.
      my $tot = $call_args->[1];   # Total bytes to be retrieved.
      my $oct = $call_args->[2];   # Chunk of raw octets received this time.

      # The total may be unknown or zero; avoid dividing by zero.
      my $percent = $tot ? $got / $tot * 100 : 0;

      printf(
        "-- %.0f%% [%d/%d]: %s\n", $percent, $got, $tot || 0, $req->uri()
      );

      # To cancel the request:
      # $_[KERNEL]->post( component => cancel => $req );
    }
387
388### DEPRECATION WARNING
389
The third return argument (the raw octets received) has been deprecated.
Use the Streaming parameter instead to get chunks of content in the
response handler.
393
394# REQUEST CALLBACKS
395
396The HTTP::Request object passed to the request event can contain a
397CODE reference as `content`.  This allows for sending large files
398without wasting memory.  Your callback should return a chunk of data
399each time it is called, and an empty string when done.  Don't forget
400to set the Content-Length header correctly.  Example:
401
    my $request = HTTP::Request->new( PUT => 'http://...' );

    my $file = '/path/to/large_file';

    # Fail early if the file cannot be read, and read it as raw bytes.
    open my $fh, '<', $file or die "Cannot open $file: $!";
    binmode $fh;

    my $upload_cb = sub {
      if ( sysread $fh, my $buf, 4096 ) {
        return $buf;
      }
      else {
        close $fh;
        return '';
      }
    };

    $request->content_length( -s $file );

    $request->content( $upload_cb );

    $kernel->post( ua => 'request', 'response', $request );
423
424# CONTENT ENCODING AND COMPRESSION
425
426Transparent content decoding has been disabled as of version 0.84.
427This also removes support for transparent gzip requesting and
428decompression.
429
To re-enable gzip compression, request it with the Accept-Encoding
header and use HTTP::Response's decoded\_content() method rather than
content():

    my $request = HTTP::Request->new(
      GET => "http://www.yahoo.com/", [
        'Accept-Encoding' => 'gzip'
      ]
    );

    # ... time passes ...

    my $content = $response->decoded_content();
442
443The change in POE::Component::Client::HTTP behavior was prompted by
444changes in HTTP::Response that surfaced a bug in the component's
445transparent gzip handling.
446
447Allowing the application to specify and handle content encodings seems
448to be the most reliable and flexible resolution.
449
450For more information about the problem and discussions regarding the
451solution, see:
452[http://www.perlmonks.org/?node\_id=683833](http://www.perlmonks.org/?node_id=683833) and
453[http://rt.cpan.org/Ticket/Display.html?id=35538](http://rt.cpan.org/Ticket/Display.html?id=35538)
454
455# CLIENT HEADERS
456
457POE::Component::Client::HTTP sets its own response headers with
458additional information.  All of its headers begin with "X-PCCH".
459
460## X-PCCH-Errmsg
461
462POE::Component::Client::HTTP may fail because of an internal client
463error rather than an HTTP protocol error.  X-PCCH-Errmsg will contain a
464human readable reason for client failures, should they occur.
465
466The text of X-PCCH-Errmsg may also be repeated in the response's
467content.
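A response handler can use this header to tell client-side failures
apart from ordinary HTTP errors.  The sketch below is an assumption
about how an application might do so: `failure_reason` is a
hypothetical helper, and it relies only on the is\_success(), header(),
and status\_line() methods that HTTP::Response provides.

```perl
use strict;
use warnings;

# Hypothetical helper: describe why a response failed, or return
# undef when the response was successful.
sub failure_reason {
  my ($response) = @_;
  return undef if $response->is_success();

  # Set by the component when the failure was a client-side one.
  my $client_error = $response->header('X-PCCH-Errmsg');
  return "client error: $client_error" if defined $client_error;

  # Otherwise report the HTTP status the response carries.
  return "HTTP error: " . $response->status_line();
}
```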
468
469## X-PCCH-Peer
470
471X-PCCH-Peer contains the remote IPv4 address and port, separated by a
472period.  For example, "127.0.0.1.8675" represents port 8675 on
473localhost.
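Because the address and the port share the same separator, the port
is whatever follows the last period.  Here is a minimal sketch of
splitting the value, assuming the IPv4 format described above;
`parse_pcch_peer` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: split an X-PCCH-Peer value into (address, port).
# Returns an empty list when the value is missing or malformed.
sub parse_pcch_peer {
  my ($peer) = @_;
  return unless defined $peer;

  # Greedy match: everything up to the final period is the address.
  return unless $peer =~ /\A(.+)\.(\d+)\z/;
  return ($1, $2);
}

my ($address, $port) = parse_pcch_peer('127.0.0.1.8675');
print "$address port $port\n";   # prints "127.0.0.1 port 8675"
```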
474
475Proxying will render X-PCCH-Peer nearly useless, since the socket will
476be connected to a proxy rather than the server itself.
477
478This feature was added at Doreen Grey's request.  Doreen wanted a
479means to find the remote server's address without having to make an
480additional request.
481
482# ENVIRONMENT
483
484POE::Component::Client::HTTP uses two standard environment variables:
485HTTP\_PROXY and NO\_PROXY.
486
487HTTP\_PROXY sets the proxy server that Client::HTTP will forward
488requests through.  NO\_PROXY sets a list of hosts that will not be
489forwarded through a proxy.
490
491See the Proxy and NoProxy constructor parameters for more information
492about these variables.
493
494# SEE ALSO
495
496This component is built upon HTTP::Request, HTTP::Response, and POE.
497Please see its source code and the documentation for its foundation
498modules to learn more.  If you want to use cookies, you'll need to
499read about HTTP::Cookies as well.
500
501Also see the test program, t/01\_request.t, in the PoCo::Client::HTTP
502distribution.
503
504# BUGS
505
506There is no support for CGI\_PROXY or CgiProxy.
507
508Secure HTTP (https) proxying is not supported at this time.
509
510There is no object oriented interface.  See
511[POE::Component::Client::Keepalive](https://metacpan.org/pod/POE::Component::Client::Keepalive) and
512[POE::Component::Resolver](https://metacpan.org/pod/POE::Component::Resolver) for examples of a decent OO interface.
513
514# AUTHOR, COPYRIGHT, & LICENSE
515
516POE::Component::Client::HTTP is
517
518- Copyright 1999-2009 Rocco Caputo
519- Copyright 2004 Rob Bloodgood
520- Copyright 2004-2005 Martijn van Beers
521
522All rights are reserved.  POE::Component::Client::HTTP is free
523software; you may redistribute it and/or modify it under the same
524terms as Perl itself.
525
526# CONTRIBUTORS
527
528Joel Bernstein solved some nasty race conditions.  Portugal Telecom
529[http://www.sapo.pt/](http://www.sapo.pt/) was kind enough to support his contributions.
530
Jeff Bisbee added POD tests to version 0.79, along with the
documentation needed to pass several of them.  He's a
kwalitee-increasing machine!
533
534# BUG TRACKER
535
536https://rt.cpan.org/Dist/Display.html?Queue=POE-Component-Client-HTTP
537
538# REPOSITORY
539
540Github: [http://github.com/rcaputo/poe-component-client-http](http://github.com/rcaputo/poe-component-client-http) .
541
542Gitorious: [http://gitorious.org/poe-component-client-http](http://gitorious.org/poe-component-client-http) .
543
544# OTHER RESOURCES
545
546[http://search.cpan.org/dist/POE-Component-Client-HTTP/](http://search.cpan.org/dist/POE-Component-Client-HTTP/)
547