1<?xml version="1.0" encoding="ISO-8859-1"?>
2<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><!--
4        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
5              This file is generated from xml source: DO NOT EDIT
6        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
7      -->
8<title>Advanced Techniques with mod_rewrite - Apache HTTP Server</title>
9<link href="../style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />
10<link href="../style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />
11<link href="../style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" /><link rel="stylesheet" type="text/css" href="../style/css/prettify.css" />
12<script src="../style/scripts/prettify.js" type="text/javascript">
13</script>
14
15<link href="../images/favicon.ico" rel="shortcut icon" /></head>
16<body id="manual-page"><div id="page-header">
17<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p>
18<p class="apache">Apache HTTP Server Version 2.4</p>
19<img alt="" src="../images/feather.gif" /></div>
20<div class="up"><a href="./"><img title="&lt;-" alt="&lt;-" src="../images/left.gif" /></a></div>
21<div id="path">
22<a href="http://www.apache.org/">Apache</a> &gt; <a href="http://httpd.apache.org/">HTTP Server</a> &gt; <a href="http://httpd.apache.org/docs/">Documentation</a> &gt; <a href="../">Version 2.4</a> &gt; <a href="./">Rewrite</a></div><div id="page-content"><div id="preamble"><h1>Advanced Techniques with mod_rewrite</h1>
23<div class="toplang">
24<p><span>Available Languages: </span><a href="../en/rewrite/avoid.html" title="English">&nbsp;en&nbsp;</a> |
<a href="../fr/rewrite/avoid.html" hreflang="fr" rel="alternate" title="Fran&ccedil;ais">&nbsp;fr&nbsp;</a></p>
26</div>
27
28
29<p>This document supplements the <code class="module"><a href="../mod/mod_rewrite.html">mod_rewrite</a></code>
30<a href="../mod/mod_rewrite.html">reference documentation</a>. It provides
31a few advanced techniques using mod_rewrite.</p>
32
33<div class="warning">Note that many of these examples won't work unchanged in your
34particular server configuration, so it's important that you understand
35them, rather than merely cutting and pasting the examples into your
36configuration.</div>
37
38</div>
<div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#sharding">URL-based sharding across multiple backends</a></li>
40<li><img alt="" src="../images/down.gif" /> <a href="#on-the-fly-content">On-the-fly Content-Regeneration</a></li>
41<li><img alt="" src="../images/down.gif" /> <a href="#load-balancing">Load Balancing</a></li>
42<li><img alt="" src="../images/down.gif" /> <a href="#autorefresh">Document With Autorefresh</a></li>
43<li><img alt="" src="../images/down.gif" /> <a href="#structuredhomedirs">Structured Userdirs</a></li>
44<li><img alt="" src="../images/down.gif" /> <a href="#redirectanchors">Redirecting Anchors</a></li>
45<li><img alt="" src="../images/down.gif" /> <a href="#time-dependent">Time-Dependent Rewriting</a></li>
46<li><img alt="" src="../images/down.gif" /> <a href="#setenvvars">Set Environment Variables Based On URL Parts</a></li>
47</ul><h3>See also</h3><ul class="seealso"><li><a href="../mod/mod_rewrite.html">Module documentation</a></li><li><a href="intro.html">mod_rewrite introduction</a></li><li><a href="remapping.html">Redirection and remapping</a></li><li><a href="access.html">Controlling access</a></li><li><a href="vhosts.html">Virtual hosts</a></li><li><a href="proxy.html">Proxying</a></li><li><a href="rewritemap.html">Using RewriteMap</a></li><li><a href="avoid.html">When not to use mod_rewrite</a></li></ul><ul class="seealso"><li><a href="#comments_section">Comments</a></li></ul></div>
48<div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
49<div class="section">
<h2><a name="sharding" id="sharding">URL-based sharding across multiple backends</a></h2>
51
52
53
54  <dl>
55    <dt>Description:</dt>
56
57    <dd>
58      <p>A common technique for distributing the burden of
59      server load or storage space is called "sharding".
      When using this method, a front-end server will use the
      URL to consistently "shard" users or objects to separate
      backend servers.</p>
63    </dd>
64
65    <dt>Solution:</dt>
66
67    <dd>
68      <p>A mapping is maintained, from users to target servers, in
69      external map files. They look like:</p>
70
71<div class="example"><p><code>
72user1  physical_host_of_user1<br />
73user2  physical_host_of_user2<br />
74:      :
75</code></p></div>
76
  <p>We put this into a <code>map.users-to-hosts</code> file. The
    aim is to map</p>
79
80<div class="example"><p><code>
81/u/user1/anypath
82</code></p></div>
83
84  <p>to</p>
85
86<div class="example"><p><code>
http://physical_host_of_user1/u/user1/anypath
88</code></p></div>
89
      <p>so that not every URL path needs to be valid on every backend
      physical host. The following ruleset does this with the help of the
      map file, assuming that <code>server0</code> is a default server
      which is used when a user has no entry in the map:</p>
94
95<pre class="prettyprint lang-config">
RewriteEngine on
RewriteMap      users-to-hosts   txt:/path/to/map.users-to-hosts
RewriteRule   ^/u/([^/]+)/?(.*)   http://${users-to-hosts:$1|server0}/u/$1/$2
99</pre>
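
      <p>Because the substitution names another host and no
      <code>[P]</code> flag is given, the ruleset above answers with an
      external redirect to the backend. If the front end should relay the
      request itself instead, the same map lookup can be combined with the
      <code>[P]</code> flag. The following is only a sketch: it assumes that
      <code class="module"><a href="../mod/mod_proxy.html">mod_proxy</a></code> and
      <code class="module"><a href="../mod/mod_proxy_http.html">mod_proxy_http</a></code> are loaded and that the
      front end can reach every backend. The map path and
      <code>server0</code> are the same placeholders as above.</p>

<pre class="prettyprint lang-config">
RewriteEngine on
RewriteMap    users-to-hosts   "txt:/path/to/map.users-to-hosts"
# [P] hands the request to mod_proxy instead of redirecting the client
RewriteRule   "^/u/([^/]+)/?(.*)"   "http://${users-to-hosts:$1|server0}/u/$1/$2" [P]
</pre>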
100
101    </dd>
102  </dl>
103
104  <p>See the <code class="directive"><a href="../mod/mod_rewrite.html#rewritemap">RewriteMap</a></code>
105  documentation for more discussion of the syntax of this directive.</p>
106
107</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
108<div class="section">
109<h2><a name="on-the-fly-content" id="on-the-fly-content">On-the-fly Content-Regeneration</a></h2>
110
111
112
113  <dl>
114    <dt>Description:</dt>
115
116    <dd>
117      <p>We wish to dynamically generate content, but store it
118      statically once it is generated. This rule will check for the
119      existence of the static file, and if it's not there, generate
120      it. The static files can be removed periodically, if desired (say,
121      via cron) and will be regenerated on demand.</p>
122    </dd>
123
124    <dt>Solution:</dt>
125
126    <dd>
127      This is done via the following ruleset:
128
129<pre class="prettyprint lang-config">
# This example is valid in per-directory context only
RewriteCond %{REQUEST_URI}   !-U
RewriteRule ^(.+)\.html$          /regenerate_page.cgi   [PT,L]
133</pre>
134
135
136    <p>The <code>-U</code> operator determines whether the test string
137    (in this case, <code>REQUEST_URI</code>) is a valid URL. It does
138    this via a subrequest. In the event that this subrequest fails -
139    that is, the requested resource doesn't exist - this rule invokes
140    the CGI program <code>/regenerate_page.cgi</code>, which generates
141    the requested resource and saves it into the document directory, so
142    that the next time it is requested, a static copy can be served.</p>
143
144    <p>In this way, documents that are infrequently updated can be served in
    static form. If documents need to be refreshed, they can be deleted
146    from the document directory, and they will then be regenerated the
147    next time they are requested.</p>
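
    <p>When the generated pages are plain files on disk, a lighter
    variant is to test for the file directly with <code>-f</code>, which
    avoids the subrequest made by <code>-U</code>. This is a minimal
    sketch, not a drop-in replacement: it assumes the rules live in a
    <code>.htaccess</code> file or <code>&lt;Directory&gt;</code> block,
    and that <code>/regenerate_page.cgi</code> is the same script as
    above.</p>

<pre class="prettyprint lang-config">
# Per-directory context: REQUEST_FILENAME is the mapped filesystem path
RewriteEngine on
RewriteCond %{REQUEST_FILENAME}   !-f
RewriteRule ^(.+)\.html$          /regenerate_page.cgi   [PT,L]
</pre>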
148    </dd>
149  </dl>
150
151</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
152<div class="section">
153<h2><a name="load-balancing" id="load-balancing">Load Balancing</a></h2>
154
155
156
157  <dl>
158    <dt>Description:</dt>
159
160    <dd>
161      <p>We wish to randomly distribute load across several servers
162      using mod_rewrite.</p>
163    </dd>
164
165    <dt>Solution:</dt>
166
167    <dd>
168      <p>We'll use <code class="directive"><a href="../mod/mod_rewrite.html#rewritemap">RewriteMap</a></code> and a list of servers
169      to accomplish this.</p>
170
171<pre class="prettyprint lang-config">
RewriteEngine on
RewriteMap lb rnd:/path/to/serverlist.txt
RewriteRule ^/(.*) http://${lb:servers}/$1 [P,L]
175</pre>
176
177
178<p><code>serverlist.txt</code> will contain a list of the servers:</p>
179
180<div class="example"><p><code>
## serverlist.txt<br />
<br />
servers one.example.com|two.example.com|three.example.com<br />
184</code></p></div>
185
186<p>If you want one particular server to get more of the load than the
187others, add it more times to the list.</p>
188
189   </dd>
190
   <dt>Discussion:</dt>
192   <dd>
193<p>Apache comes with a load-balancing module -
194<code class="module"><a href="../mod/mod_proxy_balancer.html">mod_proxy_balancer</a></code> - which is far more flexible and
195featureful than anything you can cobble together using mod_rewrite.</p>
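
<p>As a rough equivalent of the rule above, the following sketch
balances the same three hosts with
<code class="module"><a href="../mod/mod_proxy_balancer.html">mod_proxy_balancer</a></code>. It assumes that
<code class="module"><a href="../mod/mod_proxy.html">mod_proxy</a></code>,
<code class="module"><a href="../mod/mod_proxy_http.html">mod_proxy_http</a></code>,
<code class="module"><a href="../mod/mod_proxy_balancer.html">mod_proxy_balancer</a></code>,
<code class="module"><a href="../mod/mod_lbmethod_byrequests.html">mod_lbmethod_byrequests</a></code> and the modules
they depend on are loaded. The balancer name <code>mycluster</code> is
arbitrary, and the <code>loadfactor</code> value is only an
illustration of weighting one member more heavily.</p>

<pre class="prettyprint lang-config">
&lt;Proxy "balancer://mycluster"&gt;
    BalancerMember "http://one.example.com"
    BalancerMember "http://two.example.com"
    # loadfactor=2: roughly twice as many requests as the other members
    BalancerMember "http://three.example.com" loadfactor=2
&lt;/Proxy&gt;
ProxyPass        "/" "balancer://mycluster/"
ProxyPassReverse "/" "balancer://mycluster/"
</pre>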
196   </dd>
197  </dl>
198
199</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
200<div class="section">
201<h2><a name="autorefresh" id="autorefresh">Document With Autorefresh</a></h2>
202
203
204
205
206
207  <dl>
208    <dt>Description:</dt>
209
210    <dd>
211      <p>Wouldn't it be nice, while creating a complex web page, if
212      the web browser would automatically refresh the page every
213      time we save a new version from within our editor?
214      Impossible?</p>
215    </dd>
216
217    <dt>Solution:</dt>
218
219    <dd>
220      <p>No! We just combine the MIME multipart feature, the
221      web server NPH feature, and the URL manipulation power of
222      <code class="module"><a href="../mod/mod_rewrite.html">mod_rewrite</a></code>. First, we establish a new
223      URL feature: Adding just <code>:refresh</code> to any
224      URL causes the 'page' to be refreshed every time it is
225      updated on the filesystem.</p>
226
227<pre class="prettyprint lang-config">
RewriteRule   ^(/[uge]/[^/]+/?.*):refresh  /internal/cgi/apache/nph-refresh?f=$1
229</pre>
230
231
232      <p>Now when we reference the URL</p>
233
234<div class="example"><p><code>
235/u/foo/bar/page.html:refresh
236</code></p></div>
237
238      <p>this leads to the internal invocation of the URL</p>
239
240<div class="example"><p><code>
241/internal/cgi/apache/nph-refresh?f=/u/foo/bar/page.html
242</code></p></div>
243
244      <p>The only missing part is the NPH-CGI script. Although
245      one would usually say "left as an exercise to the reader"
246      ;-) I will provide this, too.</p>
247
248<pre class="prettyprint lang-perl">
#!/sw/bin/perl
##
##  nph-refresh -- NPH/CGI script for auto refreshing pages
##  Copyright (c) 1997 Ralf S. Engelschall, All Rights Reserved.
##
$| = 1;

#   split the QUERY_STRING variable into $QS_xxx variables
@pairs = split( /&amp;/, $ENV{'QUERY_STRING'} );
foreach $pair (@pairs) {
    ( $name, $value ) = split( /=/, $pair );
    $name =~ tr/A-Z/a-z/;
    $name = 'QS_' . $name;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    #   assign through a symbolic reference instead of eval, so the
    #   decoded value is never executed as Perl code
    $$name = $value;
}
$QS_s = 1    if ( $QS_s eq '' );
$QS_n = 3600 if ( $QS_n eq '' );
if ( $QS_f eq '' ) {
    print "HTTP/1.0 200 OK\n";
    print "Content-type: text/html\n\n";
    print "&lt;b&gt;ERROR&lt;/b&gt;: No file given\n";
    exit(0);
}
if ( !-f $QS_f ) {
    print "HTTP/1.0 200 OK\n";
    print "Content-type: text/html\n\n";
    print "&lt;b&gt;ERROR&lt;/b&gt;: File $QS_f not found\n";
    exit(0);
}

sub print_http_headers_multipart_begin {
    print "HTTP/1.0 200 OK\n";
    $bound = "ThisRandomString12345";
    print "Content-type: multipart/x-mixed-replace;boundary=$bound\n";
    &amp;print_http_headers_multipart_next;
}

sub print_http_headers_multipart_next {
    print "\n--$bound\n";
}

sub print_http_headers_multipart_end {
    print "\n--$bound--\n";
}

sub displayhtml {
    local ($buffer) = @_;
    $len = length($buffer);
    print "Content-type: text/html\n";
    print "Content-length: $len\n\n";
    print $buffer;
}

#   read the whole file into a buffer
sub readfile {
    local ($file) = @_;
    local ( *FP, $size, $buffer, $bytes );
    ( $x, $x, $x, $x, $x, $x, $x, $size ) = stat($file);
    $size = sprintf( "%d", $size );
    open( FP, '&lt;', $file ) or return '';
    $bytes = sysread( FP, $buffer, $size );
    close(FP);
    return $buffer;
}

$buffer = &amp;readfile($QS_f);
&amp;print_http_headers_multipart_begin;
&amp;displayhtml($buffer);

#   return the modification time of a file
sub mystat {
    local ($file) = $_[0];
    local ($mtime);

    ( $x, $x, $x, $x, $x, $x, $x, $x, $x, $mtime ) = stat($file);
    return $mtime;
}

$mtimeL = &amp;mystat($QS_f);
$mtime  = $mtimeL;
#   send at most $QS_n updates, polling every $QS_s seconds
for ( $n = 0 ; $n &lt; $QS_n ; $n++ ) {
    while (1) {
        $mtime = &amp;mystat($QS_f);
        if ( $mtime ne $mtimeL ) {
            $mtimeL = $mtime;
            sleep(2);
            $buffer = &amp;readfile($QS_f);
            &amp;print_http_headers_multipart_next;
            &amp;displayhtml($buffer);
            sleep(5);
            $mtimeL = &amp;mystat($QS_f);
            last;
        }
        sleep($QS_s);
    }
}

&amp;print_http_headers_multipart_end;

exit(0);

##EOF##
350</pre>
351
352    </dd>
353  </dl>
354
355</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
356<div class="section">
357<h2><a name="structuredhomedirs" id="structuredhomedirs">Structured Userdirs</a></h2>
358
359
360
361  <dl>
362    <dt>Description:</dt>
363
364    <dd>
365      <p>Some sites with thousands of users use a
366      structured homedir layout, <em>i.e.</em> each homedir is in a
367      subdirectory which begins (for instance) with the first
368      character of the username. So, <code>/~larry/anypath</code>
369      is <code>/home/<strong>l</strong>/larry/public_html/anypath</code>
370      while <code>/~waldo/anypath</code> is
371      <code>/home/<strong>w</strong>/waldo/public_html/anypath</code>.</p>
372    </dd>
373
374    <dt>Solution:</dt>
375
376    <dd>
377      <p>We use the following ruleset to expand the tilde URLs
378      into the above layout.</p>
379
380<pre class="prettyprint lang-config">
RewriteEngine on
RewriteRule   ^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*)  /home/<strong>$2</strong>/$1/public_html$3
383</pre>
384
385    </dd>
386  </dl>
387
388</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
389<div class="section">
390<h2><a name="redirectanchors" id="redirectanchors">Redirecting Anchors</a></h2>
391
392
393
394  <dl>
395    <dt>Description:</dt>
396
397    <dd>
398    <p>By default, redirecting to an HTML anchor doesn't work,
399    because mod_rewrite escapes the <code>#</code> character,
400    turning it into <code>%23</code>. This, in turn, breaks the
401    redirection.</p>
402    </dd>
403
404    <dt>Solution:</dt>
405
406    <dd>
407      <p>Use the <code>[NE]</code> flag on the
408      <code>RewriteRule</code>. NE stands for No Escape.
409      </p>
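
      <p>As an illustration, the following sketch redirects a request
      for <code>/anchor/xyz</code> to <code>/bigpage.html#xyz</code>.
      Both paths are hypothetical and stand in for your own URLs.</p>

<pre class="prettyprint lang-config">
RewriteEngine on
# NE keeps the "#" literal in the redirect target; R forces an external redirect
RewriteRule   "^/anchor/(.+)"   "/bigpage.html#$1" [NE,R]
</pre>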
410    </dd>
411
412    <dt>Discussion:</dt>
413    <dd>This technique will of course also work with other
414    special characters that mod_rewrite, by default, URL-encodes.</dd>
415  </dl>
416
417</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
418<div class="section">
419<h2><a name="time-dependent" id="time-dependent">Time-Dependent Rewriting</a></h2>
420
421
422
423  <dl>
424    <dt>Description:</dt>
425
426    <dd>
427      <p>We wish to use mod_rewrite to serve different content based on
428      the time of day.</p>
429    </dd>
430
431    <dt>Solution:</dt>
432
433    <dd>
434      <p>There are a lot of variables named <code>TIME_xxx</code>
435      for rewrite conditions. In conjunction with the special
436      lexicographic comparison patterns <code>&lt;STRING</code>,
437      <code>&gt;STRING</code> and <code>=STRING</code> we can
438      do time-dependent redirects:</p>
439
440<pre class="prettyprint lang-config">
RewriteEngine on
RewriteCond   %{TIME_HOUR}%{TIME_MIN} &gt;0700
RewriteCond   %{TIME_HOUR}%{TIME_MIN} &lt;1900
RewriteRule   ^foo\.html$             foo.day.html [L]
RewriteRule   ^foo\.html$             foo.night.html
446</pre>
447
448
449      <p>This provides the content of <code>foo.day.html</code>
450      under the URL <code>foo.html</code> from
451      <code>07:01-18:59</code> and at the remaining time the
452      contents of <code>foo.night.html</code>.</p>
453
454      <div class="warning"><code class="module"><a href="../mod/mod_cache.html">mod_cache</a></code>, intermediate proxies
      and browsers may each cache responses and cause either page to be
456      shown outside of the time-window configured.
457      <code class="module"><a href="../mod/mod_expires.html">mod_expires</a></code> may be used to control this
458      effect. You are, of course, much better off simply serving the
459      content dynamically, and customizing it based on the time of day.</div>
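
      <p>One way to limit that caching with
      <code class="module"><a href="../mod/mod_expires.html">mod_expires</a></code> is sketched below. It assumes
      the module is loaded and that a one-minute lifetime is acceptable
      for every HTML response in the context where the directives are
      placed; a cached copy can still be up to a minute out of date.</p>

<pre class="prettyprint lang-config">
ExpiresActive On
# Let caches keep HTML responses for at most one minute after they are fetched
ExpiresByType text/html "access plus 1 minute"
</pre>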
460
461    </dd>
462  </dl>
463
464</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
465<div class="section">
466<h2><a name="setenvvars" id="setenvvars">Set Environment Variables Based On URL Parts</a></h2>
467
468
469
470  <dl>
471    <dt>Description:</dt>
472
473    <dd>
      <p>At times, we want to maintain some kind of status when we
      perform a rewrite. For example, we may want to make a note that
      a particular rewrite was performed, so that we can check later
      whether a request came via that rewrite. One way to do this is
      by setting an environment variable.</p>
479    </dd>
480
481    <dt>Solution:</dt>
482
483    <dd>
484      <p>Use the [E] flag to set an environment variable.</p>
485
486<pre class="prettyprint lang-config">
RewriteEngine on
RewriteRule   ^/horse/(.*)   /pony/$1 [E=<strong>rewritten:1</strong>]
489</pre>
490
491
492    <p>Later in your ruleset you might check for this environment
493    variable using a RewriteCond:</p>
494
495<pre class="prettyprint lang-config">
RewriteCond %{ENV:rewritten} =1
497</pre>
498
499
500    <p>Note that environment variables do not survive an external
501    redirect. You might consider using the [CO] flag to set a
502    cookie.</p>
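
    <p>A minimal sketch of the cookie approach follows. It assumes the
    site is served from hosts under <code>example.com</code>; the cookie
    name <code>rewritten</code> and the domain are placeholders.</p>

<pre class="prettyprint lang-config">
RewriteEngine on
# R forces an external redirect; CO=NAME:VALUE:DOMAIN sets a cookie on the response
RewriteRule   "^/horse/(.*)"   "/pony/$1" [R,CO=rewritten:1:.example.com]
</pre>

    <p>A later request could then be tested against the
    <code>Cookie</code> header, for example:</p>

<pre class="prettyprint lang-config">
RewriteCond %{HTTP_COOKIE} (^|;\s*)rewritten=1(;|$)
</pre>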
503
504    </dd>
505  </dl>
506
507</div></div>
508<div class="bottomlang">
509<p><span>Available Languages: </span><a href="../en/rewrite/avoid.html" title="English">&nbsp;en&nbsp;</a> |
<a href="../fr/rewrite/avoid.html" hreflang="fr" rel="alternate" title="Fran&ccedil;ais">&nbsp;fr&nbsp;</a></p>
511</div><div class="top"><a href="#page-header"><img src="../images/up.gif" alt="top" /></a></div><div class="section"><h2><a id="comments_section" name="comments_section">Comments</a></h2><div class="warning"><strong>Notice:</strong><br />This is not a Q&amp;A section. Comments placed here should be pointed towards suggestions on improving the documentation or server, and may be removed again by our moderators if they are either implemented or considered invalid/off-topic. Questions on how to manage the Apache HTTP Server should be directed at either our IRC channel, #httpd, on Freenode, or sent to our <a href="http://httpd.apache.org/lists.html">mailing lists</a>.</div>
512<script type="text/javascript"><!--//--><![CDATA[//><!--
513var comments_shortname = 'httpd';
514var comments_identifier = 'http://httpd.apache.org/docs/2.4/rewrite/avoid.html';
515(function(w, d) {
516    if (w.location.hostname.toLowerCase() == "httpd.apache.org") {
517        d.write('<div id="comments_thread"><\/div>');
518        var s = d.createElement('script');
519        s.type = 'text/javascript';
520        s.async = true;
521        s.src = 'https://comments.apache.org/show_comments.lua?site=' + comments_shortname + '&page=' + comments_identifier;
522        (d.getElementsByTagName('head')[0] || d.getElementsByTagName('body')[0]).appendChild(s);
523    }
524    else {
525        d.write('<div id="comments_thread">Comments are disabled for this page at the moment.<\/div>');
526    }
527})(window, document);
528//--><!]]></script></div><div id="footer">
529<p class="apache">Copyright 2013 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
530<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p></div><script type="text/javascript"><!--//--><![CDATA[//><!--
531if (typeof(prettyPrint) !== 'undefined') {
532    prettyPrint();
533}
534//--><!]]></script>
535</body></html>