README
NAME
    POE::Component::Client::HTTP - an HTTP user-agent component

VERSION
    version 0.949

SYNOPSIS
      use POE qw(Component::Client::HTTP);

      POE::Component::Client::HTTP->spawn(
        Agent     => 'SpiffCrawler/0.90',   # defaults to something long
        Alias     => 'ua',                  # defaults to 'weeble'
        From      => 'spiffster@perl.org',  # defaults to undef (no header)
        Protocol  => 'HTTP/0.9',            # defaults to 'HTTP/1.1'
        Timeout   => 60,                    # defaults to 180 seconds
        MaxSize   => 16384,                 # defaults to entire response
        Streaming => 4096,                  # defaults to 0 (off)
        FollowRedirects => 2,               # defaults to 0 (off)
        Proxy     => "http://localhost:80", # defaults to HTTP_PROXY env. variable
        NoProxy   => [ "localhost", "127.0.0.1" ], # defs to NO_PROXY env. variable
        BindAddr  => "12.34.56.78",         # defaults to INADDR_ANY
      );

      $kernel->post(
        'ua',        # posts to the 'ua' alias
        'request',   # posts to ua's 'request' state
        'response',  # which of our states will receive the response
        $request,    # an HTTP::Request object
      );

      # This is the sub which is called when the session receives a
      # 'response' event.
      sub response_handler {
        my ($request_packet, $response_packet) = @_[ARG0, ARG1];

        # HTTP::Request
        my $request_object  = $request_packet->[0];

        # HTTP::Response
        my $response_object = $response_packet->[0];

        my $stream_chunk;
        if (! defined($response_object->content)) {
          $stream_chunk = $response_packet->[1];
        }

        print(
          "*" x 78, "\n",
          "*** my request:\n",
          "-" x 78, "\n",
          $request_object->as_string(),
          "*" x 78, "\n",
          "*** their response:\n",
          "-" x 78, "\n",
          $response_object->as_string(),
        );

        if (defined $stream_chunk) {
          print "-" x 40, "\n", $stream_chunk, "\n";
        }

        print "*" x 78, "\n";
      }

DESCRIPTION
    POE::Component::Client::HTTP is an HTTP user-agent for POE. It lets
    other sessions run while HTTP transactions are being processed, and it
    lets several HTTP transactions be processed in parallel.

    It supports keep-alive through POE::Component::Client::Keepalive, which
    in turn uses POE::Component::Resolver for asynchronous IPv4 and IPv6
    name resolution.

    HTTP client components are not proper objects. Instead of being created,
    as most objects are, they are "spawned" as separate sessions. To avoid
    confusion (and hopefully not cause other confusion), they must be
    spawned with a "spawn" method, not created anew with a "new" one.

CONSTRUCTOR
  spawn
    PoCo::Client::HTTP's "spawn" method takes a few named parameters:

    Agent => $user_agent_string
    Agent => \@list_of_agents
      If a User-Agent header is not present in the HTTP::Request, a random
      one will be used from those specified by the "Agent" parameter. If
      none are supplied, POE::Component::Client::HTTP will advertise itself
      to the server.

      "Agent" may contain a reference to a list of user agents. If this is
      the case, PoCo::Client::HTTP will choose one of them at random for
      each request.

    Alias => $session_alias
      "Alias" sets the name by which the session will be known. If no alias
      is given, the component defaults to "weeble". The alias lets several
      sessions interact with HTTP components without keeping (or even
      knowing) hard references to them. It's possible to spawn several HTTP
      components with different names.

    ConnectionManager => $poco_client_keepalive
      "ConnectionManager" sets this component's connection pool manager. It
      expects the connection manager to be a reference to a
      POE::Component::Client::Keepalive object. The HTTP client component
      will call "allocate()" on the connection manager itself so you should
      not have done this already.

        my $pool = POE::Component::Client::Keepalive->new(
          keep_alive    => 10,  # seconds to keep connections alive
          max_open      => 100, # max concurrent connections - total
          max_per_host  => 20,  # max concurrent connections - per host
          timeout       => 30,  # max time (seconds) to establish a new connection
        );

        POE::Component::Client::HTTP->spawn(
          # ...
          ConnectionManager => $pool,
          # ...
        );

      See POE::Component::Client::Keepalive for more information, including
      how to alter the connection manager's resolver configuration (for
      example, to force IPv6 or prefer it before IPv4).

    CookieJar => $cookie_jar
      "CookieJar" sets the component's cookie jar. It expects the cookie jar
      to be a reference to an HTTP::Cookies object.

    From => $admin_address
      "From" holds an e-mail address where the client's administrator and/or
      maintainer may be reached. It defaults to undef, which means no From
      header will be included in requests.

    MaxSize => OCTETS
      "MaxSize" specifies the largest response to accept from a server. The
      content of larger responses will be truncated to OCTETS octets. This
      has been used to return the <head></head> section of web pages without
      the need to wade through <body></body>.

    NoProxy => [ $host_1, $host_2, ..., $host_N ]
    NoProxy => "host1,host2,hostN"
      "NoProxy" specifies a list of server hosts that will not be proxied.
      It is useful for local hosts and hosts that do not properly support
      proxying. If NoProxy is not specified, a list will be taken from the
      NO_PROXY environment variable.

        NoProxy => [ "localhost", "127.0.0.1" ],
        NoProxy => "localhost,127.0.0.1",

    BindAddr => $local_ip
      Specify "BindAddr" to bind all client sockets to a particular local
      address. The value of BindAddr will be passed through
      POE::Component::Client::Keepalive to POE::Wheel::SocketFactory (as
      "bind_address"). See that module's documentation for implementation
      details.

        BindAddr => "12.34.56.78"

    Protocol => $http_protocol_string
      "Protocol" advertises the protocol that the client wishes to see.
      Under normal circumstances, it should be left to its default value:
      "HTTP/1.1".

    Proxy => [ $proxy_host, $proxy_port ]
    Proxy => $proxy_url
    Proxy => $proxy_url,$proxy_url,...
      "Proxy" specifies one or more proxy hosts that requests will be passed
      through. If not specified, proxy servers will be taken from the
      HTTP_PROXY (or http_proxy) environment variable. No proxying will
      occur unless Proxy is set or one of the environment variables exists.

      The proxy can be specified either as a host and port, or as one or
      more URLs. Proxy URLs must specify the proxy port, even if it is 80.

        Proxy => [ "127.0.0.1", 80 ],
        Proxy => "http://127.0.0.1:80/",

      "Proxy" may specify multiple proxies separated by commas.
      PoCo::Client::HTTP will choose proxies from this list at random. This
      is useful for load balancing requests through multiple gateways.

        Proxy => "http://127.0.0.1:80/,http://127.0.0.1:81/",

    Streaming => OCTETS
      "Streaming" allows Client::HTTP to return large content in chunks (of
      OCTETS octets each) rather than combine the entire content into a
      single HTTP::Response object.

      By default, Client::HTTP reads the entire content for a response into
      memory before returning an HTTP::Response object. This is obviously
      bad for applications like streaming MP3 clients, because they often
      fetch songs that never end. Yes, they go on and on, my friend.

      When "Streaming" is set to nonzero, however, the response handler
      receives chunks of up to OCTETS octets apiece. The response handler
      accepts slightly different parameters in this case. ARG1 is still the
      response packet: its first element is an HTTP::Response object that
      does not contain response content, and its second element is a chunk
      of raw response content, or undef if the stream has ended.

        sub streaming_response_handler {
          my $response_packet = $_[ARG1];
          my ($response, $data) = @$response_packet;
          print SAVED_STREAM $data if defined $data;
        }

    FollowRedirects => $number_of_hops_to_follow
      "FollowRedirects" specifies how many redirects (e.g. 302 Moved) to
      follow. If not specified, it defaults to 0, and thus no redirection is
      followed. This maintains compatibility with the previous behavior,
      which was not to follow redirects at all.

      If redirects are followed, a response chain should be built, and can
      be accessed through $response_object->previous(). See HTTP::Response
      for details here.

    Timeout => $query_timeout
      "Timeout" sets how long POE::Component::Client::HTTP has to process an
      application's request, in seconds. "Timeout" defaults to 180 (three
      minutes) if not specified.

      It's important to note that the timeout begins when the component
      receives an application's request, not when it attempts to connect to
      the web server.

      Timeouts may result from sending the component too many requests at
      once. Each request would need to be received and tracked in order.
      Consider this:

        $_[KERNEL]->post(component => request => ...) for (1..15_000);

      15,000 requests are queued together in one enormous bolus. The
      component would receive and initialize them in order. The first socket
      activity wouldn't arrive until the 15,000th request was set up. If
      that took longer than "Timeout", then the requests that have waited
      too long would fail.

      "ConnectionManager"'s own timeout and concurrency limits also affect
      how many requests may be processed at once. For example, most of the
      15,000 requests would wait in the connection manager's pool until
      sockets become available. Meanwhile, the "Timeout" would be counting
      down.

      Applications may elect to control concurrency outside the component's
      "Timeout". They may do so in a few ways.

      The easiest way is to limit the initial number of requests to
      something more manageable. As responses arrive, the application should
      handle them and start new requests. This limits concurrency to the
      initial request count.

      An application may also outsource job throttling to another module,
      such as POE::Component::JobQueue.

      In any case, "Timeout" and "ConnectionManager" may be tuned to
      maximize timeouts and concurrency limits. This may help in some cases.
      Developers should be aware that doing so will increase memory usage.
      POE::Component::Client::HTTP and KeepAlive track requests in memory,
      while applications are free to keep pending requests on disk.

ACCEPTED EVENTS
    Sessions communicate asynchronously with PoCo::Client::HTTP. They post
    requests to it, and it posts responses back.

  request
    Requests are posted to the component's "request" state. They include an
    HTTP::Request object which defines the request. For example:

      $kernel->post(
        'ua', 'request',            # http session alias & state
        'response',                 # my state to receive responses
        GET('http://poe.perl.org'), # a simple HTTP request
        'unique id',                # a tag to identify the request
        'progress',                 # an event to indicate progress
        'http://1.2.3.4:80/'        # proxy to use for this request
      );

    Requests include the state to which responses will be posted. In the
    previous example, the handler for a 'response' state will be called with
    each HTTP response. The "progress" handler is optional and if installed,
    the component will provide progress metrics (see sample handler below).
    The "proxy" parameter is optional and if not defined, a default proxy
    will be used if configured. No proxy will be used if neither a default
    one nor a "proxy" parameter is defined.

  pending_requests_count
    There's also a pending_requests_count state that returns the number of
    requests currently being processed. To receive the return value, it must
    be invoked with $kernel->call().

      my $count = $kernel->call('ua' => 'pending_requests_count');

    NOTE: Sometimes the count might not be what you expected, because
    responses are currently in POE's queue and you haven't processed them.
    This could happen if you configure the "ConnectionManager"'s concurrency
    to a high enough value.

  cancel
    Cancel a specific HTTP request. Requires a reference to the original
    request (blessed or stringified) so it knows which one to cancel. See
    "progress handler" below for notes on canceling streaming requests.

    To cancel a request based on its blessed HTTP::Request object:

      $kernel->post( component => cancel => $http_request );

    To cancel a request based on its stringified HTTP::Request object:

      $kernel->post( component => cancel => "$http_request" );

  shutdown
    Responds to all pending requests with 408 (request timeout), and then
    shuts down the component and all subcomponents.

SENT EVENTS
  response handler
    In addition to all the usual POE parameters, HTTP responses come with
    two list references:

      my ($request_packet, $response_packet) = @_[ARG0, ARG1];

    $request_packet contains a reference to the original HTTP::Request
    object. This is useful for matching responses back to the requests that
    generated them.

      my $http_request_object = $request_packet->[0];
      my $http_request_tag    = $request_packet->[1]; # from the 'request' post

    $response_packet contains a reference to the resulting HTTP::Response
    object.

      my $http_response_object = $response_packet->[0];

    Please see the HTTP::Request and HTTP::Response manpages for more
    information.

  progress handler
    The example progress handler shows how to calculate a percentage of
    download completion.

      sub progress_handler {
        my $gen_args  = $_[ARG0];    # args passed to all calls
        my $call_args = $_[ARG1];    # args specific to the call

        my $req = $gen_args->[0];    # HTTP::Request object being serviced
        my $tag = $gen_args->[1];    # Request ID tag from the 'request' post.
        my $got = $call_args->[0];   # Number of bytes retrieved so far.
        my $tot = $call_args->[1];   # Total bytes to be retrieved.
        my $oct = $call_args->[2];   # Chunk of raw octets received this time.

        # Guard against dividing by zero if the total is missing or zero.
        my $percent = $tot ? $got / $tot * 100 : 0;

        printf(
          "-- %.0f%% [%d/%d]: %s\n", $percent, $got, $tot, $req->uri()
        );

        # To cancel the request:
        # $_[KERNEL]->post( component => cancel => $req );
      }

   DEPRECATION WARNING
    The third return argument (the raw octets received) has been deprecated.
    Instead of it, use the Streaming parameter to get chunks of content in
    the response handler.

REQUEST CALLBACKS
    The HTTP::Request object passed to the request event can contain a CODE
    reference as "content". This allows for sending large files without
    wasting memory. Your callback should return a chunk of data each time it
    is called, and an empty string when done. Don't forget to set the
    Content-Length header correctly. Example:

      my $request = HTTP::Request->new( PUT => 'http://...' );

      my $file = '/path/to/large_file';

      # Stop early if the file cannot be read.
      open my $fh, '<', $file or die "Cannot open $file: $!";

      # Return the next chunk on each call, and '' once the file is exhausted.
      my $upload_cb = sub {
        if ( sysread $fh, my $buf, 4096 ) {
          return $buf;
        }
        else {
          close $fh;
          return '';
        }
      };

      $request->content_length( -s $file );

      $request->content( $upload_cb );

      $kernel->post( ua => 'request', 'response', $request );

CONTENT ENCODING AND COMPRESSION
    Transparent content decoding has been disabled as of version 0.84. This
    also removes support for transparent gzip requesting and decompression.

    To re-enable gzip compression, request it with an Accept-Encoding
    header and use HTTP::Response's decoded_content() method rather than
    content():

      my $request = HTTP::Request->new(
        GET => "http://www.yahoo.com/", [
          'Accept-Encoding' => 'gzip'
        ]
      );

      # ... time passes ...

      my $content = $response->decoded_content();

    The change in POE::Component::Client::HTTP behavior was prompted by
    changes in HTTP::Response that surfaced a bug in the component's
    transparent gzip handling.

    Allowing the application to specify and handle content encodings seems
    to be the most reliable and flexible resolution.

    For more information about the problem and discussions regarding the
    solution, see: <http://www.perlmonks.org/?node_id=683833> and
    <http://rt.cpan.org/Ticket/Display.html?id=35538>

CLIENT HEADERS
    POE::Component::Client::HTTP sets its own response headers with
    additional information. All of its headers begin with "X-PCCH".

  X-PCCH-Errmsg
    POE::Component::Client::HTTP may fail because of an internal client
    error rather than an HTTP protocol error. X-PCCH-Errmsg will contain a
    human readable reason for client failures, should they occur.

    The text of X-PCCH-Errmsg may also be repeated in the response's
    content.

  X-PCCH-Peer
    X-PCCH-Peer contains the remote IPv4 address and port, separated by a
    period. For example, "127.0.0.1.8675" represents port 8675 on localhost.

    Proxying will render X-PCCH-Peer nearly useless, since the socket will
    be connected to a proxy rather than the server itself.

    This feature was added at Doreen Grey's request. Doreen wanted a means
    to find the remote server's address without having to make an additional
    request.

ENVIRONMENT
    POE::Component::Client::HTTP uses two standard environment variables:
    HTTP_PROXY and NO_PROXY.

    HTTP_PROXY sets the proxy server that Client::HTTP will forward requests
    through. NO_PROXY sets a list of hosts that will not be forwarded
    through a proxy.

    See the Proxy and NoProxy constructor parameters for more information
    about these variables.

SEE ALSO
    This component is built upon HTTP::Request, HTTP::Response, and POE.
    Please see its source code and the documentation for its foundation
    modules to learn more. If you want to use cookies, you'll need to read
    about HTTP::Cookies as well.

    Also see the test program, t/01_request.t, in the PoCo::Client::HTTP
    distribution.

BUGS
    There is no support for CGI_PROXY or CgiProxy.

    Secure HTTP (https) proxying is not supported at this time.

    There is no object oriented interface. See
    POE::Component::Client::Keepalive and POE::Component::Resolver for
    examples of a decent OO interface.

AUTHOR, COPYRIGHT, & LICENSE
    POE::Component::Client::HTTP is

    * Copyright 1999-2009 Rocco Caputo

    * Copyright 2004 Rob Bloodgood

    * Copyright 2004-2005 Martijn van Beers

    All rights are reserved. POE::Component::Client::HTTP is free software;
    you may redistribute it and/or modify it under the same terms as Perl
    itself.

CONTRIBUTORS
    Joel Bernstein solved some nasty race conditions. Portugal Telecom
    <http://www.sapo.pt/> was kind enough to support his contributions.

    Jeff Bisbee added POD tests and documentation to pass several of them to
    version 0.79. He's a kwalitee-increasing machine!

BUG TRACKER
    https://rt.cpan.org/Dist/Display.html?Queue=POE-Component-Client-HTTP

REPOSITORY
    Github: <http://github.com/rcaputo/poe-component-client-http> .

    Gitorious: <http://gitorious.org/poe-component-client-http> .

OTHER RESOURCES
    <http://search.cpan.org/dist/POE-Component-Client-HTTP/>

README.mkdn
# NAME

POE::Component::Client::HTTP - an HTTP user-agent component

# VERSION

version 0.949

# SYNOPSIS

    use POE qw(Component::Client::HTTP);

    POE::Component::Client::HTTP->spawn(
      Agent     => 'SpiffCrawler/0.90',   # defaults to something long
      Alias     => 'ua',                  # defaults to 'weeble'
      From      => 'spiffster@perl.org',  # defaults to undef (no header)
      Protocol  => 'HTTP/0.9',            # defaults to 'HTTP/1.1'
      Timeout   => 60,                    # defaults to 180 seconds
      MaxSize   => 16384,                 # defaults to entire response
      Streaming => 4096,                  # defaults to 0 (off)
      FollowRedirects => 2,               # defaults to 0 (off)
      Proxy     => "http://localhost:80", # defaults to HTTP_PROXY env. variable
      NoProxy   => [ "localhost", "127.0.0.1" ], # defs to NO_PROXY env. variable
      BindAddr  => "12.34.56.78",         # defaults to INADDR_ANY
    );

    $kernel->post(
      'ua',        # posts to the 'ua' alias
      'request',   # posts to ua's 'request' state
      'response',  # which of our states will receive the response
      $request,    # an HTTP::Request object
    );

    # This is the sub which is called when the session receives a
    # 'response' event.
    sub response_handler {
      my ($request_packet, $response_packet) = @_[ARG0, ARG1];

      # HTTP::Request
      my $request_object  = $request_packet->[0];

      # HTTP::Response
      my $response_object = $response_packet->[0];

      my $stream_chunk;
      if (! defined($response_object->content)) {
        $stream_chunk = $response_packet->[1];
      }

      print(
        "*" x 78, "\n",
        "*** my request:\n",
        "-" x 78, "\n",
        $request_object->as_string(),
        "*" x 78, "\n",
        "*** their response:\n",
        "-" x 78, "\n",
        $response_object->as_string(),
      );

      if (defined $stream_chunk) {
        print "-" x 40, "\n", $stream_chunk, "\n";
      }

      print "*" x 78, "\n";
    }

# DESCRIPTION

POE::Component::Client::HTTP is an HTTP user-agent for POE. It lets
other sessions run while HTTP transactions are being processed, and it
lets several HTTP transactions be processed in parallel.

It supports keep-alive through POE::Component::Client::Keepalive,
which in turn uses POE::Component::Resolver for asynchronous IPv4 and
IPv6 name resolution.

HTTP client components are not proper objects. Instead of being
created, as most objects are, they are "spawned" as separate sessions.
To avoid confusion (and hopefully not cause other confusion), they
must be spawned with a `spawn` method, not created anew with a `new`
one.

# CONSTRUCTOR

## spawn

PoCo::Client::HTTP's `spawn` method takes a few named parameters:

- Agent => $user\_agent\_string
- Agent => \\@list\_of\_agents

    If a User-Agent header is not present in the HTTP::Request, a random
    one will be used from those specified by the `Agent` parameter. If
    none are supplied, POE::Component::Client::HTTP will advertise itself
    to the server.

    `Agent` may contain a reference to a list of user agents. If this is
    the case, PoCo::Client::HTTP will choose one of them at random for
    each request.

- Alias => $session\_alias

    `Alias` sets the name by which the session will be known. If no
    alias is given, the component defaults to "weeble". The alias lets
    several sessions interact with HTTP components without keeping (or
    even knowing) hard references to them. It's possible to spawn several
    HTTP components with different names.

- ConnectionManager => $poco\_client\_keepalive

    `ConnectionManager` sets this component's connection pool manager.
    It expects the connection manager to be a reference to a
    POE::Component::Client::Keepalive object. The HTTP client component
    will call `allocate()` on the connection manager itself so you should
    not have done this already.

        my $pool = POE::Component::Client::Keepalive->new(
          keep_alive    => 10,  # seconds to keep connections alive
          max_open      => 100, # max concurrent connections - total
          max_per_host  => 20,  # max concurrent connections - per host
          timeout       => 30,  # max time (seconds) to establish a new connection
        );

        POE::Component::Client::HTTP->spawn(
          # ...
          ConnectionManager => $pool,
          # ...
        );

    See [POE::Component::Client::Keepalive](https://metacpan.org/pod/POE::Component::Client::Keepalive) for more information,
    including how to alter the connection manager's resolver
    configuration (for example, to force IPv6 or prefer it before IPv4).

- CookieJar => $cookie\_jar

    `CookieJar` sets the component's cookie jar. It expects the cookie
    jar to be a reference to an HTTP::Cookies object.

- From => $admin\_address

    `From` holds an e-mail address where the client's administrator
    and/or maintainer may be reached. It defaults to undef, which means
    no From header will be included in requests.

- MaxSize => OCTETS

    `MaxSize` specifies the largest response to accept from a server.
    The content of larger responses will be truncated to OCTETS octets.
    This has been used to return the `<head></head>` section of web pages
    without the need to wade through `<body></body>`.

- NoProxy => \[ $host\_1, $host\_2, ..., $host\_N \]
- NoProxy => "host1,host2,hostN"

    `NoProxy` specifies a list of server hosts that will not be proxied.
    It is useful for local hosts and hosts that do not properly support
    proxying. If NoProxy is not specified, a list will be taken from the
    NO\_PROXY environment variable.

        NoProxy => [ "localhost", "127.0.0.1" ],
        NoProxy => "localhost,127.0.0.1",

- BindAddr => $local\_ip

    Specify `BindAddr` to bind all client sockets to a particular local
    address. The value of BindAddr will be passed through
    POE::Component::Client::Keepalive to POE::Wheel::SocketFactory (as
    `bind_address`). See that module's documentation for implementation
    details.

        BindAddr => "12.34.56.78"

- Protocol => $http\_protocol\_string

    `Protocol` advertises the protocol that the client wishes to see.
    Under normal circumstances, it should be left to its default value:
    "HTTP/1.1".

- Proxy => \[ $proxy\_host, $proxy\_port \]
- Proxy => $proxy\_url
- Proxy => $proxy\_url,$proxy\_url,...

    `Proxy` specifies one or more proxy hosts that requests will be
    passed through. If not specified, proxy servers will be taken from
    the HTTP\_PROXY (or http\_proxy) environment variable. No proxying will
    occur unless Proxy is set or one of the environment variables exists.

    The proxy can be specified either as a host and port, or as one or
    more URLs. Proxy URLs must specify the proxy port, even if it is 80.

        Proxy => [ "127.0.0.1", 80 ],
        Proxy => "http://127.0.0.1:80/",

    `Proxy` may specify multiple proxies separated by commas.
    PoCo::Client::HTTP will choose proxies from this list at random. This
    is useful for load balancing requests through multiple gateways.

        Proxy => "http://127.0.0.1:80/,http://127.0.0.1:81/",

- Streaming => OCTETS

    `Streaming` allows Client::HTTP to return large content in chunks
    (of OCTETS octets each) rather than combine the entire content into
    a single HTTP::Response object.

    By default, Client::HTTP reads the entire content for a response into
    memory before returning an HTTP::Response object. This is obviously
    bad for applications like streaming MP3 clients, because they often
    fetch songs that never end. Yes, they go on and on, my friend.

    When `Streaming` is set to nonzero, however, the response handler
    receives chunks of up to OCTETS octets apiece. The response handler
    accepts slightly different parameters in this case. ARG1 is still
    the response packet: its first element is an HTTP::Response object
    that does not contain response content, and its second element is a
    chunk of raw response content, or undef if the stream has ended.

        sub streaming_response_handler {
          my $response_packet = $_[ARG1];
          my ($response, $data) = @$response_packet;
          print SAVED_STREAM $data if defined $data;
        }

- FollowRedirects => $number\_of\_hops\_to\_follow

    `FollowRedirects` specifies how many redirects (e.g. 302 Moved) to
    follow. If not specified, it defaults to 0, and thus no redirection is
    followed. This maintains compatibility with the previous behavior,
    which was not to follow redirects at all.

    If redirects are followed, a response chain should be built, and can
    be accessed through $response\_object->previous(). See HTTP::Response
    for details here.

- Timeout => $query\_timeout

    `Timeout` sets how long POE::Component::Client::HTTP has to process
    an application's request, in seconds. `Timeout` defaults to 180
    (three minutes) if not specified.

    It's important to note that the timeout begins when the component
    receives an application's request, not when it attempts to connect to
    the web server.

    Timeouts may result from sending the component too many requests at
    once. Each request would need to be received and tracked in order.
    Consider this:

        $_[KERNEL]->post(component => request => ...) for (1..15_000);

    15,000 requests are queued together in one enormous bolus. The
    component would receive and initialize them in order. The first
    socket activity wouldn't arrive until the 15,000th request was set up.
    If that took longer than `Timeout`, then the requests that have
    waited too long would fail.

    `ConnectionManager`'s own timeout and concurrency limits also affect
    how many requests may be processed at once. For example, most of the
    15,000 requests would wait in the connection manager's pool until
    sockets become available. Meanwhile, the `Timeout` would be counting
    down.

    Applications may elect to control concurrency outside the component's
    `Timeout`. They may do so in a few ways.

    The easiest way is to limit the initial number of requests to
    something more manageable. As responses arrive, the application
    should handle them and start new requests. This limits concurrency to
    the initial request count.

    An application may also outsource job throttling to another module,
    such as POE::Component::JobQueue.

    In any case, `Timeout` and `ConnectionManager` may be tuned to
    maximize timeouts and concurrency limits. This may help in some
    cases. Developers should be aware that doing so will increase memory
    usage. POE::Component::Client::HTTP and KeepAlive track requests in
    memory, while applications are free to keep pending requests on disk.

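The first throttling approach described under `Timeout` (start a small batch, then start one new request per finished response) can be sketched in plain Perl. This is only an illustration under stated assumptions: `make_throttle`, `$fill`, and `$done` are hypothetical names that are not part of POE::Component::Client::HTTP, and the `$start` callback stands in for whatever code posts a request to the component.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of POE::Component::Client::HTTP.
# Keeps at most $limit requests in flight at any time.
#   $pending - array ref of requests that have not been started yet
#   $start   - code ref that starts one request (for example, by posting
#              it to the component's 'request' state)
sub make_throttle {
  my ($limit, $pending, $start) = @_;
  my $in_flight = 0;

  # Start queued requests until the limit is reached or the queue is empty.
  my $fill = sub {
    while ($in_flight < $limit and @$pending) {
      $in_flight++;
      $start->(shift @$pending);
    }
    return $in_flight;
  };

  # Call once per finished request; frees a slot and refills it.
  my $done = sub {
    $in_flight-- if $in_flight > 0;
    return $fill->();
  };

  return ($fill, $done);
}
```

In a session, `$start` might be `sub { $kernel->post(ua => request => response => $_[0]) }`, with `$fill->()` called once at startup and `$done->()` called from the response handler. With `Streaming` enabled the handler runs once per chunk, so `$done->()` would only be called when the chunk is undef.
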
# ACCEPTED EVENTS

Sessions communicate asynchronously with PoCo::Client::HTTP. They
post requests to it, and it posts responses back.

## request

Requests are posted to the component's "request" state. They include
an HTTP::Request object which defines the request. For example:

    $kernel->post(
      'ua', 'request',            # http session alias & state
      'response',                 # my state to receive responses
      GET('http://poe.perl.org'), # a simple HTTP request
      'unique id',                # a tag to identify the request
      'progress',                 # an event to indicate progress
      'http://1.2.3.4:80/'        # proxy to use for this request
    );

Requests include the state to which responses will be posted. In the
previous example, the handler for a 'response' state will be called
with each HTTP response. The "progress" handler is optional and if
installed, the component will provide progress metrics (see sample
handler below). The "proxy" parameter is optional and if not defined,
a default proxy will be used if configured. No proxy will be used if
neither a default one nor a "proxy" parameter is defined.

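Because the `request` arguments are positional, a request that wants a per-request proxy but no tag or progress event has to fill the skipped slots. The sketch below assumes that passing `undef` for an unused optional slot behaves like omitting it. The text above says as much for the proxy parameter, but for the tag and progress event this is an assumption. `request_args` is a hypothetical helper, not part of the component.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the positional argument list for the
# 'request' state from named options, in the order shown above:
# response event, HTTP::Request, tag, progress event, proxy.
sub request_args {
  my (%opt) = @_;
  die "response event is required\n" unless defined $opt{response};
  die "request is required\n"        unless defined $opt{request};

  # Hash slice: missing options become undef placeholders.
  my @args = @opt{qw(response request tag progress proxy)};

  # Drop trailing undefined slots so short calls stay short.
  pop @args while @args and not defined $args[-1];
  return @args;
}
```

A call might then look like `$kernel->post(ua => request => request_args(response => 'response', request => $request, proxy => 'http://1.2.3.4:80/'))`.
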
## pending\_requests\_count

There's also a pending\_requests\_count state that returns the number of
requests currently being processed. To receive the return value, it
must be invoked with $kernel->call().

    my $count = $kernel->call('ua' => 'pending_requests_count');

NOTE: Sometimes the count might not be what you expected, because responses
are currently in POE's queue and you haven't processed them. This could happen
if you configure the `ConnectionManager`'s concurrency to a high enough value.

320## cancel
321
322Cancel a specific HTTP request. Requires a reference to the original
323request (blessed or stringified) so it knows which one to cancel. See
324["progress handler"](#progress-handler) below for notes on canceling streaming requests.
325
326To cancel a request based on its blessed HTTP::Request object:
327
328 $kernel->post( component => cancel => $http_request );
329
330To cancel a request based on its stringified HTTP::Request object:
331
332 $kernel->post( component => cancel => "$http_request" );

## shutdown

Responds to all pending requests with 408 (request timeout), and then
shuts down the component and all subcomponents.

# SENT EVENTS

## response handler

In addition to all the usual POE parameters, HTTP responses come with
two list references:

    my ($request_packet, $response_packet) = @_[ARG0, ARG1];

`$request_packet` contains a reference to the original HTTP::Request
object. This is useful for matching responses back to the requests
that generated them.

    my $http_request_object = $request_packet->[0];
    my $http_request_tag    = $request_packet->[1]; # from the 'request' post

`$response_packet` contains a reference to the resulting
HTTP::Response object.

    my $http_response_object = $response_packet->[0];

Please see the HTTP::Request and HTTP::Response manpages for more
information.

## progress handler

The example progress handler shows how to calculate a percentage of
download completion.

    sub progress_handler {
      my $gen_args  = $_[ARG0];    # args passed to all calls
      my $call_args = $_[ARG1];    # args specific to the call

      my $req = $gen_args->[0];    # HTTP::Request object being serviced
      my $tag = $gen_args->[1];    # Request ID tag from the 'request' post.
      my $got = $call_args->[0];   # Number of bytes retrieved so far.
      my $tot = $call_args->[1];   # Total bytes to be retrieved.
      my $oct = $call_args->[2];   # Chunk of raw octets received this time.

      my $percent = $got / $tot * 100;

      printf(
        "-- %.0f%% [%d/%d]: %s\n", $percent, $got, $tot, $req->uri()
      );

      # To cancel the request:
      # $_[KERNEL]->post( component => cancel => $req );
    }
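
The handler above divides by `$tot`, which fails if the total is zero or undefined. As a minimal sketch, assuming the total can be unknown (for example, when a server sends no Content-Length), the percentage can be computed defensively. `progress_percent` and `progress_line` are hypothetical helper names, not part of the component:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a percentage between 0 and 100, or undef
# when the total size is unknown or zero.
sub progress_percent {
    my ($got, $tot) = @_;
    return undef unless defined $got && defined $tot && $tot > 0;
    my $percent = $got / $tot * 100;
    return $percent > 100 ? 100 : $percent;
}

# Hypothetical helper: formats one progress line, falling back to "?"
# when no percentage can be computed.
sub progress_line {
    my ($got, $tot, $uri) = @_;
    my $percent = progress_percent($got, $tot);
    return defined $percent
        ? sprintf("-- %.0f%% [%d/%d]: %s", $percent, $got, $tot, $uri)
        : sprintf("-- ?%% [%d/?]: %s", $got // 0, $uri);
}
```

Inside a progress handler, `print progress_line($got, $tot, $req->uri()), "\n";` could then stand in for the `printf` call.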

### DEPRECATION WARNING

The third call-specific argument (the raw octets received) has been
deprecated. Use the Streaming parameter instead to get chunks of
content in the response handler.

# REQUEST CALLBACKS

The HTTP::Request object passed to the request event can contain a
CODE reference as `content`. This allows for sending large files
without wasting memory. Your callback should return a chunk of data
each time it is called, and an empty string when done. Don't forget
to set the Content-Length header correctly. Example:

    my $request = HTTP::Request->new( PUT => 'http://...' );

    my $file = '/path/to/large_file';

    open my $fh, '<', $file or die "Cannot open $file: $!";
    binmode $fh;  # send the file's bytes unmodified

    my $upload_cb = sub {
      if ( sysread $fh, my $buf, 4096 ) {
        return $buf;
      }
      else {
        close $fh;
        return '';
      }
    };

    $request->content_length( -s $file );

    $request->content( $upload_cb );

    $kernel->post( ua => 'request', 'response', $request );
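
The callback contract described above (one chunk per call, then an empty string) can be wrapped in a small factory. This is only a sketch using core Perl. `make_upload_callback` is a hypothetical name, not something the component provides:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a callback that yields the file's contents
# in chunks of at most $chunk_size bytes, then '' once the file is exhausted.
sub make_upload_callback {
    my ($path, $chunk_size) = @_;
    $chunk_size ||= 4096;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;

    return sub {
        return '' unless $fh;                 # already finished
        my $read = sysread $fh, my $buf, $chunk_size;
        die "Read error on $path: $!" unless defined $read;
        return $buf if $read;                 # a chunk of data
        close $fh;                            # end of file reached
        undef $fh;
        return '';
    };
}
```

It could then be attached with `$request->content_length( -s $file )` and `$request->content( make_upload_callback($file) )`, as in the example above. Calling the callback again after the end of the file keeps returning an empty string.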

# CONTENT ENCODING AND COMPRESSION

Transparent content decoding has been disabled as of version 0.84.
This also removes support for transparent gzip requesting and
decompression.

To re-enable gzip compression, request it with an Accept-Encoding
header of gzip and use HTTP::Response's decoded\_content() method
rather than content():

    my $request = HTTP::Request->new(
      GET => "http://www.yahoo.com/", [
        'Accept-Encoding' => 'gzip'
      ]
    );

    # ... time passes ...

    my $content = $response->decoded_content();

The change in POE::Component::Client::HTTP behavior was prompted by
changes in HTTP::Response that surfaced a bug in the component's
transparent gzip handling.

Allowing the application to specify and handle content encodings seems
to be the most reliable and flexible resolution.

For more information about the problem and discussions regarding the
solution, see:
[http://www.perlmonks.org/?node\_id=683833](http://www.perlmonks.org/?node_id=683833) and
[http://rt.cpan.org/Ticket/Display.html?id=35538](http://rt.cpan.org/Ticket/Display.html?id=35538)

# CLIENT HEADERS

POE::Component::Client::HTTP sets its own response headers with
additional information. All of its headers begin with "X-PCCH".

## X-PCCH-Errmsg

POE::Component::Client::HTTP may fail because of an internal client
error rather than an HTTP protocol error. X-PCCH-Errmsg will contain a
human readable reason for client failures, should they occur.

The text of X-PCCH-Errmsg may also be repeated in the response's
content.
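
A response handler might look for this header to tell client failures apart from ordinary HTTP responses. The following is a minimal sketch, assuming `$response` is the HTTP::Response taken from `$response_packet->[0]`. `client_error_message` is a hypothetical helper, not part of the component:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the client error text when the response
# carries an X-PCCH-Errmsg header, or undef otherwise. $response is
# expected to be an HTTP::Response (any object with a header() method).
sub client_error_message {
    my ($response) = @_;
    my $errmsg = $response->header('X-PCCH-Errmsg');
    return (defined $errmsg && length $errmsg) ? $errmsg : undef;
}

# Possible use inside a response handler:
#   my $response = $response_packet->[0];
#   if (defined(my $why = client_error_message($response))) {
#       warn "client-side failure: $why\n";
#   }
```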

## X-PCCH-Peer

X-PCCH-Peer contains the remote IPv4 address and port, separated by a
period. For example, "127.0.0.1.8675" represents port 8675 on
localhost.
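
Given that format, the address and port can be separated at the last period. A minimal sketch follows. `parse_pcch_peer` is a hypothetical helper name, not part of the component:

```perl
use strict;
use warnings;

# Hypothetical helper: splits an X-PCCH-Peer value such as "127.0.0.1.8675"
# into (address, port). Returns an empty list if the value does not look
# like an IPv4 address followed by a period and a port number.
sub parse_pcch_peer {
    my ($peer) = @_;
    return unless defined $peer;

    if ($peer =~ /\A(\d{1,3}(?:\.\d{1,3}){3})\.(\d+)\z/) {
        return ($1, $2);
    }
    return;
}

my ($address, $port) = parse_pcch_peer('127.0.0.1.8675');
print "$address port $port\n";    # prints "127.0.0.1 port 8675"
```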

Proxying will render X-PCCH-Peer nearly useless, since the socket will
be connected to a proxy rather than the server itself.

This feature was added at Doreen Grey's request. Doreen wanted a
means to find the remote server's address without having to make an
additional request.

# ENVIRONMENT

POE::Component::Client::HTTP uses two standard environment variables:
HTTP\_PROXY and NO\_PROXY.

HTTP\_PROXY sets the proxy server that Client::HTTP will forward
requests through. NO\_PROXY sets a list of hosts that will not be
forwarded through a proxy.

See the Proxy and NoProxy constructor parameters for more information
about these variables.
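
To illustrate how the two variables map onto the constructor parameters, here is a rough sketch. It assumes NO\_PROXY holds a comma-separated host list, which is the common convention but is not spelled out in this document. `proxy_settings_from_env` is a hypothetical helper, and the component's own parsing may differ:

```perl
use strict;
use warnings;

# Hypothetical helper: builds Proxy/NoProxy settings from an environment
# hash (defaults to %ENV). Assumes NO_PROXY is a comma-separated list.
sub proxy_settings_from_env {
    my ($env) = @_;
    $env ||= \%ENV;
    my %settings;

    my $proxy = $env->{HTTP_PROXY};
    $settings{Proxy} = $proxy if defined $proxy && length $proxy;

    my $no_proxy = $env->{NO_PROXY};
    if (defined $no_proxy) {
        (my $list = $no_proxy) =~ s/\A\s+|\s+\z//g;   # trim outer whitespace
        my @hosts = grep { length } split /\s*,\s*/, $list;
        $settings{NoProxy} = \@hosts if @hosts;
    }

    return %settings;
}
```

An application that wants to inspect or adjust these values before spawning the component could pass the result along, for example `POE::Component::Client::HTTP->spawn( Alias => 'ua', proxy_settings_from_env() )`.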

# SEE ALSO

This component is built upon HTTP::Request, HTTP::Response, and POE.
Please see its source code and the documentation for its foundation
modules to learn more. If you want to use cookies, you'll need to
read about HTTP::Cookies as well.

Also see the test program, t/01\_request.t, in the PoCo::Client::HTTP
distribution.

# BUGS

There is no support for CGI\_PROXY or CgiProxy.

Secure HTTP (https) proxying is not supported at this time.

There is no object oriented interface. See
[POE::Component::Client::Keepalive](https://metacpan.org/pod/POE::Component::Client::Keepalive) and
[POE::Component::Resolver](https://metacpan.org/pod/POE::Component::Resolver) for examples of a decent OO interface.

# AUTHOR, COPYRIGHT, & LICENSE

POE::Component::Client::HTTP is

- Copyright 1999-2009 Rocco Caputo
- Copyright 2004 Rob Bloodgood
- Copyright 2004-2005 Martijn van Beers

All rights are reserved. POE::Component::Client::HTTP is free
software; you may redistribute it and/or modify it under the same
terms as Perl itself.

# CONTRIBUTORS

Joel Bernstein solved some nasty race conditions. Portugal Telecom
[http://www.sapo.pt/](http://www.sapo.pt/) was kind enough to support his contributions.

Jeff Bisbee added POD tests to version 0.79, along with the
documentation needed to pass several of them. He's a
kwalitee-increasing machine!

# BUG TRACKER

https://rt.cpan.org/Dist/Display.html?Queue=POE-Component-Client-HTTP

# REPOSITORY

Github: [http://github.com/rcaputo/poe-component-client-http](http://github.com/rcaputo/poe-component-client-http) .

Gitorious: [http://gitorious.org/poe-component-client-http](http://gitorious.org/poe-component-client-http) .

# OTHER RESOURCES

[http://search.cpan.org/dist/POE-Component-Client-HTTP/](http://search.cpan.org/dist/POE-Component-Client-HTTP/)